Fine-tuning GPT-2 from human preferences
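OpenAI shares its experience fine-tuning GPT-2 (774M parameters) with human feedback. The model learned to copy text from the source to satisfy labeler preferences, revealing a counterintuitive effect in preference alignment.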
We’ve fine-tuned the 774M parameter GPT-2 language model using human feedback for various tasks, successfully matching the preferences of the external…