A Systematic Investigation of RL-Jailbreaking in LLMs
A systematic study of reinforcement-learning-based jailbreak attacks on LLMs that reveals new AI safety risks; worth watching.
arXiv:2605.07032v2 Announce Type: replace-cross Abstract: The evolution of generative models from next-token predictors to autonomous engines of compl…