Probing Persona-Dependent Preferences in Language Models
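A study probing preferences in large language models: it reveals how model behavior differs across personas, with implications for tuning AI alignment.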
arXiv:2605.13339v2 Announce Type: replace Abstract: Large language models (LLMs) can be said to have preferences: they reliably pick certain tasks and…