Niu Ge's Picks · All

1

📝 Deep Tech arXiv NLP 2026-07-14

PM-KVQ: Progressive Mixed-precision KV Cache Quantization for Long-CoT LLMs

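A new advance in KV cache quantization for long chain-of-thought LLMs: a progressive mixed-precision scheme, accepted at ICLR 2026, with code open-sourced.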

arXiv:2505.18610v2 Announce Type: replace Abstract: Recently, significant progress has been made in developing reasoning-capable Large Language Models…

KV cache quantization, mixed precision, long chain-of-thought, large language models, progressive quantization

2

🤖 AI·Large Models Dev.to 2026-07-06

Sakana Fugu: How Collaborative AI is Changing the Game

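Fugu is a multi-model collaboration framework that coordinates expert models such as GPT-5 and Claude; it is more efficient and flexible than a single giant model.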

# Sakana Fugu: The Multi-Agent AI System That Works Like a Team We’ve all been there: copy-pasting a prompt from ChatGPT to Claude, and then to Gemini…

sakana fug, collaborative AI, multi-model framework, expert models, ICLR 2026

3

📝 Deep Tech arXiv AI 2026-06-10

The Price of Agreement: Measuring LLM Sycophancy in Agentic Financial Applications

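A new study measuring sycophancy finds that LLMs in agentic financial settings distort their advice to please users, exposing a potential risk.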

arXiv:2604.24668v3 Announce Type: replace Abstract: Given the increased use of LLMs in financial systems today, it becomes important to evaluate the s…

LLM sycophancy, financial agents, LLM evaluation, AI ethics, decision bias

4

🤖 AI Tools arXiv Machine Learning 2026-06-09

Difference-Aware Retrieval Policies for Imitation Learning

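Direct link to a recent ICLR 2026 imitation learning paper, with code and demos, for getting up to speed on difference-aware retrieval policies.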

arXiv:2606.09758v1 Announce Type: cross Abstract: Parametric imitation learning via behavior cloning can suffer from poor generalization to out-of-dis…

machine learning, imitation learning, retrieval policies, arXiv preprint, ICLR 2026

5

📝 Deep Tech arXiv NLP 2026-06-04

Enhancing Hallucination Detection through Noise Injection

A new ICLR 2026 method: injecting noise to improve hallucination detection in LLMs, a novel and effective idea.

arXiv:2502.03799v4 Announce Type: replace Abstract: Large Language Models (LLMs) are prone to generating plausible yet incorrect responses, known as h…

hallucination detection, noise injection, ICLR 2026, large language models, robustness

6

🤖 AI·Large Models arXiv AI 2026-06-04

SoLoPO: Unlocking Long-Context Capabilities in LLMs via Short-to-Long Preference Optimization

SoLoPO proposes short-to-long preference optimization to efficiently unlock long-context capabilities in LLMs; accepted at ICLR 2026.

arXiv:2505.11166v3 Announce Type: replace-cross Abstract: Despite advances in pretraining with extended context sizes, large language models (LLMs) st…

SoLoPO, long context, preference optimization, large language models, ICLR 2026

7

🤖 AI·Large Models arXiv Machine Learning 2026-06-02

Quality-Diversity Evolution for Discovering Diverse Vulnerabilities in LLM Safety

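Uses quality-diversity evolutionary algorithms to systematically uncover LLM safety vulnerabilities, going beyond the limits of traditional adversarial testing.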

arXiv:2606.00801v1 Announce Type: cross Abstract: Current approaches to LLM adversarial testing suffer from coverage gaps: manual red-teaming does not…

LLM safety, quality-diversity evolution, vulnerability discovery, adversarial testing, ICLR 2026

8

📝 Deep Tech arXiv Machine Learning 2026-05-29

Uncertainty Estimation via Hyperspherical Confidence Mapping

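A recent ICLR 2026 paper proposing hyperspherical confidence mapping, a method for efficiently improving the accuracy of uncertainty estimation in deep learning models.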

arXiv:2605.05964v2 Announce Type: replace Abstract: Quantifying uncertainty in neural network predictions is essential for high-stakes domains such as…

uncertainty estimation, hyperspherical confidence mapping, ICLR 2026, deep learning, confidence calibration

9

📝 Deep Tech arXiv Computer Vision 2026-05-22

Enhancing Visual Token Representations for Video Large Language Models via Training-Free Spatial-Temporal Pooling and Gridding

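Improves visual token representations in video large language models through spatial-temporal pooling and gridding, with no additional training. Accepted at ICLR 2026.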

arXiv:2605.22078v1 Announce Type: cross Abstract: Recent advances in Multimodal Large Language Models (MLLMs) have significantly advanced video unders…

video large language models, visual tokens, spatial-temporal pooling, training-free, ICLR 2026

10

📝 Deep Tech arXiv Machine Learning 2026-05-20

Dr.LLM: Dynamic Layer Routing in LLMs

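Proposes a dynamic layer routing mechanism that lets LLMs skip irrelevant layers at inference time, markedly improving both efficiency and accuracy.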

arXiv:2510.12773v2 Announce Type: replace-cross Abstract: Large Language Models (LLMs) process every token through all layers of a transformer stack, …

Dr.LLM, dynamic layer routing, large language models, inference optimization, sparse routing

11

📝 Deep Tech arXiv Computer Vision 2026-05-20

Efficient Spatially-Variant Convolution via Differentiable Sparse Kernel Complex

Proposes DISK, a differentiable sparse kernel complex that enables efficient spatially-variant convolution; accepted at ICLR 2026.

arXiv:2512.04556v2 Announce Type: replace-cross Abstract: Image convolution with complex kernels is a fundamental operation in photography, scientific…

spatially-variant convolution, differentiable sparse kernels, ICLR 2026, deep learning, convolutional neural networks

12

📝 Deep Tech arXiv NLP 2026-05-20

Mixture-of-Experts Can Surpass Dense LLMs Under Strictly Equal Resource

A recent ICLR 2026 study showing for the first time that MoE architectures can surpass dense LLMs under strictly equal resources.

arXiv:2506.12119v2 Announce Type: replace Abstract: Mixture-of-Experts (MoE) language models dramatically expand model capacity and achieve remarkable…

MoE, large models, resource efficiency, ICLR 2026, deep learning

13

📝 Deep Tech arXiv Machine Learning 2026-05-20

Eliminating Inductive Bias in Reward Models with Information-Theoretic Guidance

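An ICLR 2026 paper that uses information-theoretic guidance to eliminate inductive bias in reward models, giving reinforcement learning alignment a more objective basis for evaluation.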

arXiv:2512.23461v2 Announce Type: replace Abstract: Reward models (RMs) are essential in reinforcement learning from human feedback (RLHF) to align la…

reward models, inductive bias, information theory, reinforcement learning, debiasing methods

14

🤖 AI·Large Models arXiv Machine Learning 2026-05-20

Hybrid Training for Vision-Language-Action Models

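An ICLR 2026 paper proposing a hybrid training framework that unifies vision-language-action models and improves multimodal embodied-intelligence performance.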

arXiv:2510.00600v2 Announce Type: replace-cross Abstract: Using Large Language Models to produce intermediate thoughts, a.k.a. Chain-of-thought (CoT),…

vision-lan, hybrid tra, multimodal, embodied intelligence, ICLR 2026

15

🤖 AI Tools arXiv Machine Learning 2026-05-20

Proximal Diffusion Neural Sampler

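A paper accepted at ICLR 2026 that uses proximal optimization to improve diffusion models, yielding a more efficient neural sampler; aimed at machine learning researchers.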

arXiv:2510.03824v2 Announce Type: replace Abstract: The task of learning a diffusion-based neural sampler for drawing samples from an unnormalized tar…

diffusion models, neural samplers, proximal optimization, ICLR 2026, probabilistic sampling
