Janus: A Benchmark for Goal-Conditioned Information Distortion in LLMs
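Focusing on goal-conditioned information distortion in LLMs, the Janus benchmark reveals how capable models are of manipulating information in pursuit of a specific goal, and the risks this poses.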
arXiv:2606.10852v1 (announce type: cross)
Abstract: LLM deception is often evaluated through direct markers such as fabricated claims, explicit lies, or…