SLEIGHT-Bench: A Benchmark of Evasion Attacks Against Agent Monitors
The first benchmark dedicated to evasion attacks against AI agent monitors, revealing a new dimension of security vulnerabilities.
arXiv:2605.16626v2 Announce Type: replace-cross Abstract: Since autonomous coding agents generate complex behaviors at high volume, we may want to use…