Be Kind, Rewrite: Benign Projections via Rewriting Defend Against LLM Data Poisoning Attacks
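New finding from a top-conference paper: using LLM rewriting to counter data poisoning, a gentle yet effective defense strategy against backdoor attacks.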
arXiv:2605.19147v1 Announce Type: cross Abstract: Large language models (LLMs) are highly susceptible to backdoor attacks (BAs), wherein training samp…