Evaluation Drift in LLM Personality Induction: Are We Moving the Goalpost?
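Study finds that evaluation criteria quietly shift in LLM personality tests, calling the reliability of existing conclusions into question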
arXiv:2605.16996v1 (new submission). Abstract: Can large language models reliably express a human-like personality, or are they merely mimicking surf…