DECOR: Auditing LLM Deception via Information Manipulation Theory
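Auditing deceptive behavior in large language models through Information Manipulation Theory: a new method aimed at a blind spot in AI safety.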
arXiv:2605.19270v1 Announce Type: new Abstract: Large language models can deceive by subtly manipulating truthful information -- omitting key facts, s…