Alignment Dynamics in LLM Fine-Tuning
Demystifying Why Alignment Is Fragile in LLM Fine-Tuning: A Unified View from Parameter Dynamics to Output Distributions
arXiv:2605.18309v1 Announce Type: new Abstract: Although Large Language Models (LLMs) achieve strong alignment through supervised fine-tuning and rein…