Beyond Forgetting: Machine Unlearning Elicits Controllable Side Behaviors and Capabilities
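Machine unlearning does more than erase what a model has memorized: it can also induce controllable side behaviors and capabilities, which challenges the conventional view of unlearning.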
arXiv:2601.21702v3
Abstract: We consider Representation Misdirection (RM), a class of large language model (LLM) unlearning met…