How Reliable Are AI Attackers Against a Fixed Vulnerable Target? A 400-Run Empirical Study of LLM Penetration Testing Consistency
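400 repeated runs reveal: are large language models really this "unstable" as hackers? The first quantitative study of LLM penetration testing consistency.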
arXiv:2605.30096v1 Announce Type: cross Abstract: Large language models (LLMs) can autonomously conduct multi-stage cyber attacks, but the consistency…