牛哥精选 · All


📝 Deep Tech arXiv Machine Learning 2026-07-02

MosaicKV: Serving Long-Context LLM with Dynamic Two-D KV Cache Compression

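A new method for speeding up long-context LLM inference: MosaicKV applies dynamic two-dimensional KV cache compression to significantly reduce GPU memory usage while preserving accuracy.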

arXiv:2607.00760v1 Announce Type: new Abstract: Long-context LLM services now sustain prompts with hundreds of thousands to millions of tokens, making…

long context, KV cache compression, LLM inference optimization, dynamic compression, two-dimensional cache

🤖 AI · Large Models IT之家 2026-06-30

NVIDIA launches robotics talent recruitment in China, focusing on four directions including embodied intelligence

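NVIDIA is recruiting in China with high salaries, targeting the real-world deployment of embodied intelligence and humanoid robot technology, and building out four directions to accelerate adoption of its AI computing platform.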

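IT之家, June 30: According to 每日经济新闻, which learned this from NVIDIA, the global AI chip giant NVIDIA has recently launched a large-scale robotics talent recruitment plan in China. It has opened multiple positions across four core directions (embodied intelligence, simulation, deployment, and solution architecture) in Beijing, Shanghai, and Shenzhen. According to the report, the embodied intelligence team has the most positions in this round, with a total of 6 open…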


🤖 AI · Large Models arXiv AI 2026-06-10

RKSC: Reasoning-Aware KV Cache Sharing and Confident Early Exit for Multi-Step LLM Inference

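A new advance in multi-step LLM inference: reasoning-aware KV cache sharing combined with a confident early-exit mechanism substantially improves efficiency. Accepted to an ICML 2026 Workshop.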

arXiv:2606.09937v1 Announce Type: cross Abstract: We introduce RKSC (Reasoning-Aware KV Cache Sharing), a training-free inference framework that elimi…

RKSC, KV cache sharing, confident early exit, multi-step reasoning, LLM inference optimization

🤖 AI · Large Models arXiv Machine Learning 2026-06-02

Leyline: KV Cache Directives for Agentic Inference

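Moves past the limits of conventional KV cache management with inference optimization aimed specifically at agentic LLMs, introducing a new paradigm of policy-driven cache editing.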

arXiv:2606.01065v1 Announce Type: cross Abstract: Modern KV cache management assumes the chatbot workload: prompts arrive once and the cache grows app…

KV cache, agentic inference, large language models, cache management, AI inference optimization

🎨 Design Tools arXiv Machine Learning 2026-05-23

Manifold-Guided Attention Steering


arXiv:2605.21770v1 Announce Type: new Abstract: Large language models frequently produce errors in reasoning tasks despite possessing the underlying k…

LLM inference optimization, attention steering, activation editing, reasoning consistency, online demo

🤖 AI · Large Models arXiv AI 2026-05-19

Optimizing LLM Inference: Fluid-Guided Online Scheduling with Memory Constraints

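Proposes a fluid-guided online scheduling method that optimizes LLM inference under memory constraints, significantly reducing latency and operating costs.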

arXiv:2504.11320v3 Announce Type: replace-cross Abstract: Large language models now serve millions of users daily, with providers incurring costs exce…

LLM inference, KV cache, scheduling optimization, memory constraints, LLM inference optimization
