牛哥精选 (Niu Ge's Picks) · All


📝 In-Depth Tech · arXiv · Machine Learning · 2026-05-19

Reducing the Safety Tax in LLM Safety Alignment with On-Policy Self-Distillation

The paper proposes an on-policy self-distillation method that reduces the "safety tax" of LLM safety alignment without sacrificing reasoning ability.

arXiv:2605.15239v1 (announce type: new)

Abstract: Safety alignment often improves robustness to harmful queries at the cost of reasoning ability, a trad…

Tags: LLM safety alignment, on-policy self-, safety tax, distribution mismatch, reasoning ability
