Niu Ge's Picks · This Month

1

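🤖 AI · Large Models arXiv AI 2026-06-11

The Power of Test-Time Training for Approximate Sampling

Explores how test-time training can help with approximate sampling, offering a new angle on inference problems in generative AI.

arXiv:2606.11437v1 Announce Type: cross Abstract: Efficiently sampling from a complex probability distribution is a fundamental problem which has beco…

test-time training, approximate sampling, probability distributions, efficient sampling, generative AI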

2

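📝 Deep Tech arXiv AI 2026-06-10

ASA: Backbone-Training-Free Representation Engineering for Tool-Calling Agents

Uses representation engineering, with no backbone training, to keep LLM agents stable as tool-calling conditions change, sidestepping the bottlenecks of conventional fine-tuning.

arXiv:2602.04935v3 Announce Type: replace-cross Abstract: Adapting LLM agents to domain-specific tool calling remains notably brittle under evolving i…

LLM agents, tool calling, representation engineering, no fine-tuning, distribution shift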

3

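📝 Deep Tech arXiv AI 2026-06-10

Dynamic Linear Attention

Introduces dynamic linear attention, a new attention mechanism that aims to improve both efficiency and accuracy. Accepted at ICML 2026.

arXiv:2606.10650v1 Announce Type: cross Abstract: The scalability of Large Language Models (LLMs) to long contexts is fundamentally constrained by the…

dynamic linear attention, transforme, attention mechanism, ICML 2026, efficiency optimization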

4

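📝 Deep Tech arXiv AI 2026-06-10

LC-QAT: Data-Efficient 2-Bit QAT for LLMs via Linear-Constrained Vector Quantization

A new result in extremely low-bit LLM quantization: linear-constrained vector quantization enables data-efficient 2-bit quantization-aware training. Accepted at ICML 2026.

arXiv:2606.10531v1 Announce Type: cross Abstract: Quantization-aware training (QAT) is essential for extremely low-bit large language models (LLMs). C…

LLM quantization, 2-bit, vector quantization, QAT, data-efficient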

5

📝 Deep Tech arXiv AI 2026-06-10

SHAPE: Coalition-Aware Expert Pruning for Sparse Mixture-of-Experts LLMs

A new approach to deploying sparse MoE LLMs: an expert pruning method that uses a coalition-aware strategy.

arXiv:2606.09886v1 Announce Type: cross Abstract: Sparse Mixture-of-Experts (MoE) large language models achieve strong quality with low per-token comp…

sparse MoE, large language models, expert pruning, coalition-aware, model deployment

6

📝 Deep Tech arXiv AI 2026-06-10

Instruction Finetuning DeepSeek-R1-8B Model Using LoRA and NEFTune

Fine-tunes DeepSeek-R1-8B with LoRA and NEFTune for financial named-entity recognition, aiming to cut resource use while improving performance.

arXiv:2606.10392v1 Announce Type: new Abstract: Financial named-entity recognition (NER) is essential for translating unstructured financial reports a…

deepseek-r, LoRA, NEFTune, instruction fine-tuning, model optimization

7

🤖 AI · Large Models arXiv NLP 2026-06-10

WebChallenger: A Reliable and Efficient Generalist Web Agent

Proposes WebChallenger, a reliable and efficient generalist web agent designed for complex web tasks.

arXiv:2606.10423v1 Announce Type: new Abstract: Autonomous web navigation remains challenging for LLM agents, and the strongest generalist systems rel…

web agent, LLM, autonomous navigation, low cost, efficient

8

📝 Deep Tech arXiv Machine Learning 2026-06-09

POET-X: Memory-efficient LLM Training by Scaling Orthogonal Transformation

An ICML 2026 Oral paper that proposes a memory-efficient approach to LLM training by scaling orthogonal transformation.

arXiv:2603.05500v2 Announce Type: replace Abstract: Efficient and stable training of large language models (LLMs) remains a core challenge in modern m…

LLM training, memory optimization, orthogonal transformation, ICML 2026, memory-efficient

9

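📝 Deep Tech arXiv Machine Learning 2026-06-09

Reformulate LLM Reinforcement Learning for Efficient Training under Black-box Discrepancy

When LLM reinforcement learning runs into black-box discrepancy, this paper proposes a reformulated framework for more efficient training.

arXiv:2606.08779v1 Announce Type: new Abstract: Reinforcement Learning (RL) has emerged as a pivotal post-training paradigm, yet it frequently suffers…

LLM, reinforcement learning, black-box discrepancy, efficient training, paper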

10

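📝 Deep Tech arXiv Machine Learning 2026-06-09

Sample-Efficient Post-Training for LEGO Spatial-Physics Reasoning

Proposes a sample-efficient post-training method to ease the data bottleneck in LEGO spatial-physics reasoning tasks.

arXiv:2606.07602v1 Announce Type: new Abstract: LLM-based LEGO assembly generation requires both semantic grounding and physical feasibility. We ident…

LEGO, spatial reasoning, physical reasoning, sample-efficient, post-training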

11

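📄 Docs & Manuals IT 之家 2026-06-08

DJI launches its first high-efficiency GaN fast charger, the POWER 140W, at a final price of 209 yuan

DJI's first 140W GaN fast charger: three output ports, PD 3.1 support, and a final price of just 209 yuan.

IT之家, June 8: DJI today released its first high-efficiency GaN fast charger, the POWER 140W. The new product features three output ports, supports the PD 3.1 protocol, and comes with a 7A cable with a digital display, at a pre-sale final price of 209 yuan. According to the announcement, the charger delivers up to 140W from a single port and can quickly charge phones, tablets, and laptops. It has two USB-C…

final price, DJI, first high-efficiency GaN fast charger, launch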

12

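🤖 AI Tools Product Hunt 2026-06-04

Nemotron 3 Ultra by NVIDIA

NVIDIA's Nemotron 3 Ultra, built for faster, more efficient reasoning in long-running AI agents.

Powers faster, efficient reasoning for long-running agents

AI inference acceleration, long-running agents, efficient reasoning, NVIDIA, LLM optimization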

13

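📝 Deep Tech arXiv NLP 2026-06-04

Efficient Reasoning on the Edge

A new approach to efficient reasoning on edge devices that balances performance against resource constraints; of interest to researchers working on deployment.

arXiv:2603.16867v2 Announce Type: replace-cross Abstract: Large language models (LLMs) with chain-of-thought reasoning achieve state-of-the-art perfor…

edge computing, efficient reasoning, model optimization, deployment, arXiv paper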

14

⚡ Productivity Tools Dev.to 2026-06-04

The Cheapest Way to Self-Host Memos in 2026

One-click deployment for $3 a month, with an HTTPS instance ready in 30 seconds: the cheapest way to self-host Memos in 2026.

Last updated: June 2026 Memos is the lightweight, open-source notes app a lot of people land on after getting tired of flomo, Google Keep, or a paid N…

self-hosting, Memos, low cost, VPS deployment, note-taking app

15

📝 Deep Tech arXiv AI 2026-06-03

Leave it to the Specialist: Repair Sparse LLMs with Sparse Fine-Tuning via Sparsity Evolution

Proposes sparse fine-tuning via sparsity evolution to repair sparse LLMs efficiently, balancing performance against compute overhead.

arXiv:2505.24037v3 Announce Type: replace Abstract: Sparse large language models (LLMs) offer an attractive direction toward efficient deployment, but…

sparse LLMs, sparse fine-tuning, sparsity evolution, LLM repair, model compression

16

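🤖 AI Tools IT 之家 2026-06-03

After Phi-4-mini: Microsoft announces the more efficient Aion-1.0-Instruct model and a translation API for the Edge browser, supporting more than 145 languages

Microsoft Edge gains a built-in Aion-1.0-Instruct small model and a translation API supporting more than 145 languages, running on-device with no cloud service required, so developers can build AI-native web experiences.

IT之家, June 3: At the Build 2026 developer conference, which opened today, Microsoft announced that it is extending Edge's on-device AI capabilities with new models and APIs, building on the Phi-4-mini-based writing assistance APIs it introduced for the browser last year. The update covers three main items: the Aion-1.0-Instruct small language model…

Microsoft, announcement, browser, more efficient model, translation support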

17

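🤖 AI · Large Models arXiv AI 2026-06-03

E2LLM: Towards Efficient LLM Serving in Heterogeneous Edge/Fog Environments

A new approach to efficient LLM serving in heterogeneous edge/fog environments, targeting deployment latency and resource optimization.

arXiv:2606.03770v1 Announce Type: cross Abstract: Large Language Models (LLMs) have become integral to modern applications, yet their deployment remai…

E2LLM, LLM serving, edge computing, heterogeneous environments, fog computing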

18

📝 Deep Tech arXiv AI 2026-06-03

Libra: Efficient Resource Management for Agentic RL Post-Training

Libra, a new approach to managing resources efficiently in agentic RL post-training, aimed at lowering training cost and improving performance.

arXiv:2606.03077v1 Announce Type: cross Abstract: Reinforcement learning (RL) has become a standard post-training paradigm for large language models (…

Libra, resource management, agentic RL, post-training, reinforcement learning

19

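🤖 AI · Large Models arXiv Machine Learning 2026-06-02

Efficient LLM Moderation with Multi-Layer Latent Prototypes

Proposes a multi-layer latent prototype method to make LLM content moderation both more accurate and faster.

arXiv:2502.16174v4 Announce Type: replace Abstract: Although modern LLMs are aligned with human values during post-training, robust moderation remains…

llm modera, multi-layer latent prototypes, LLM safety, efficient moderation, arXiv paper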

20

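📝 Deep Tech arXiv Machine Learning 2026-06-01

Learning a Zeroth-Order Optimizer for Fine-Tuning LLMs

Proposes a learnable zeroth-order optimizer that fine-tunes LLMs efficiently without gradients, substantially reducing memory overhead.

arXiv:2510.00419v2 Announce Type: replace Abstract: Zeroth-order optimizers have recently emerged as an attractive approach for fine-tuning large lang…

zeroth-order optimization, LLM fine-tuning, gradient-free optimization, memory optimization, LLM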
