Niu Ge's Picks (牛哥精选) · All


📝 Deep Tech arXiv NLP 2026-06-02

DFlare: Scaling Up Draft Capacity for Block Diffusion Speculative Decoding

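An advance in LLM inference: block diffusion speculative decoding predicts all tokens in a block at once to speed up inference and improve model efficiency.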

arXiv:2606.02091v1 Announce Type: new Abstract: Block diffusion speculative decoding accelerates LLM inference by predicting all tokens within a block…

LLM, inference acceleration, speculative decoding, block diffusion, target model, draft model

📝 Deep Tech arXiv AI 2026-05-27

HiSpec: Hierarchical Speculative Decoding for LLMs

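Targets the verification bottleneck of speculative decoding in LLM inference, proposing a hierarchical speculative decoding method that significantly speeds up inference.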

arXiv:2510.01336v2 Announce Type: replace-cross Abstract: Speculative decoding accelerates LLM inference by using a smaller draft model to speculate t…

speculative decoding, LLM inference, hierarchical, acceleration, verification bottleneck, draft model

📝 Deep Tech arXiv AI 2026-05-21

Exploring and Developing a Pre-Model Safeguard with Draft Models

Uses draft models to intercept harmful outputs before inference, offering a lightweight new approach to AI safety.

arXiv:2605.19321v1 Announce Type: cross Abstract: Large Language Model (LLM) alignment remains vulnerable to jailbreak attacks that elicit unsafe resp…

AI safety, LLM safety, draft model, pre-model safeguard, arXiv paper
