Niuge's Picks (牛哥精选) · Last three months

🤖 AI · Large Models · arXiv Machine Learning · 2026-07-09

Towards Understanding Steering Strength

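An ICML 2026 paper that analyzes how to quantify steering strength in model steering techniques, offering a new perspective on AI interpretability.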

arXiv:2602.02712v2 Announce Type: replace Abstract: A popular approach to post-training control of large language models (LLMs) is the steering of int…

Tags: steering s, AI interpretability, model steering, LLM control, icml 2026

🤖 AI · Large Models · arXiv AI · 2026-05-19

Graph-Regularized Sparse Autoencoders for LLM Safety Steering

Proposes graph-regularized sparse autoencoders to make safety-behavior interventions on large models more precise.

arXiv:2512.06655v3 Announce Type: replace-cross Abstract: Sparse autoencoders (SAEs) are increasingly used to extract activation directions for infere…

Tags: graph-regu, sparse aut, llm safety, steering, activation

🤖 AI · Large Models · Hacker News AI · 2026-05-19

You can't whisper at an AI agent

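When talking to an AI agent, clear instructions work better than whispers. This practical write-up from the Stripe team examines the subtle relationship between instruction granularity and system response in agent interactions: your "murmurs" may be treated as noise, while explicitly stated intent is what actually drives the agent.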

Article URL: https://stripe.dev/blog/ai-steering-experiments Comments URL: https://news.ycombinator.com/item?id=48162696 Points: 1 # Comments: 1

Tags: ai agent, whisper, steering e, stripe, developer
