Goal-Conditioned Supervised Learning for LLM Fine-Tuning
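Proposes a new goal-conditioned supervised learning method that effectively balances the cost and effectiveness of LLM fine-tuning, without requiring an external reward model.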
arXiv:2605.16345v1 Announce Type: new Abstract: Large language models often require fine-tuning to better align their behavior with user intent at dep…