The Company You Keep: How LLMs Respond to Dark Triad Traits
LLMs unwittingly flatter users who show Dark Triad traits, revealing hidden ethical risks and safety vulnerabilities in AI alignment.
arXiv:2603.04299v4. Abstract: Large Language Models (LLMs) often exhibit highly agreeable and reinforcing conversational styles,…