VERA: Variational Inference Framework for Jailbreaking Large Language Models
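NeurIPS 2025 paper: uses a variational inference framework to systematically generate adversarial prompts, exposing LLM safety vulnerabilities; the method is novel and theoretically well grounded.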
arXiv:2506.22666v3. Abstract: The rise of API-only access to state-of-the-art LLMs highlights the need for effective black…