Niu Ge's Picks · This Month

1

🤖 AI · Large Models arXiv Machine Learning 2026-05-20

Goal-Conditioned Supervised Learning for LLM Fine-Tuning

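Proposes a goal-conditioned supervised learning method that balances the cost and effectiveness of LLM fine-tuning without requiring an external reward model.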

arXiv:2605.16345v1 Announce Type: new Abstract: Large language models often require fine-tuning to better align their behavior with user intent at dep…

LLM fine-tuning, goal-conditioned learning, supervised learning, alignment, cost optimization

2

📝 Deep Tech arXiv Machine Learning 2026-05-20

Augmenting Human Evaluation with LLM Judges: How Many Human Reviews Do You Need?

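Examines how LLM judges can augment human evaluation and quantifies how many human reviews are needed, balancing the cost and quality of AI system evaluation.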

arXiv:2605.16354v1 Announce Type: new Abstract: Large language models (LLMs) are increasingly used as automated evaluators of AI systems, including in…

LLM evaluation, human evaluation, model evaluation, automated evaluation, cost-effectiveness

3

📝 Deep Tech arXiv Machine Learning 2026-05-20

LEAF: A Living Benchmark for Event-Augmented Forecasting

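For forecasting tasks, the LEAF living benchmark fills a gap in multi-dimensional event evaluation, bringing tests of LLM forecasting ability closer to real-world conditions.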

arXiv:2605.16358v1 Announce Type: new Abstract: Large Language Models (LLMs) are increasingly applied to forecasting. To evaluate this capability whil…

machine learning, event-augmented forecasting, living benchmark, large-model evaluation, time series forecasting

4

📝 Deep Tech arXiv Machine Learning 2026-05-20

Mixing Times of Glauber Dynamics on Masked Language Models

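Analyzes the mixing time of Glauber dynamics on masked language models from a statistical-physics perspective, providing a theoretical basis for understanding MLM sampling behavior.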

arXiv:2605.16378v1 Announce Type: new Abstract: Masked language models (MLMs) define local conditional distributions over tokens but do not, in genera…

Glauber dynamics, masked language models, mixing time, Markov chains, statistical mechanics

5

📝 Deep Tech arXiv Machine Learning 2026-05-20

A Theory of Training Profit-Optimal LLMs

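From loss to profit: characterizes the cost-benefit optimum of training LLMs at scale, offering a theoretical framework for AI investment and returns.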

arXiv:2605.16430v1 Announce Type: new Abstract: Scaling LLMs requires tremendous computational resources, and recent advances in AI have gone hand in …

LLM, profit optimization, cost-benefit, scaling laws, economic models

6

📝 Deep Tech arXiv Machine Learning 2026-05-20

Nested Spatio-Temporal Time Series Forecasting

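Accepted at ICML 2026. Proposes a nested spatio-temporal time series forecasting method for accurate modeling of multi-level spatio-temporal data.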

arXiv:2605.16447v1 Announce Type: new Abstract: Spatiotemporal forecasting is critical for real-world applications like traffic management, yet captur…

spatio-temporal forecasting, time series, nested models, ICML 2026, deep learning

7

📝 Deep Tech arXiv Machine Learning 2026-05-20

Wavelet Flow Matching for Multi-Scale Physics Emulation

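A wavelet-based flow matching method for multi-scale physics emulation that efficiently generates high-fidelity physical fields.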

arXiv:2605.16573v1 Announce Type: new Abstract: Accurate emulation of multi-scale physical systems governed by PDEs demands models that remain stable …

wavelet fl, multi-scale physics emulation, flow matching, wavelets, physics simulation

8

📝 Deep Tech arXiv Machine Learning 2026-05-20

Structure-Aware Masking for Protein Representation Learning

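A structure-aware masking method that improves protein representation learning, offering new ideas for AI-driven drug discovery and protein design.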

arXiv:2605.16581v1 Announce Type: new Abstract: Masked language modeling (MLM) is the standard objective for training protein language models, typical…

protein representation learning, structure-aware masking, deep learning, AI for sci, self-supervised learning

9

📝 Deep Tech arXiv Machine Learning 2026-05-20

R2V Agent: Teaching SLMs When to Ask for Help

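Teaches small language models to judge when to "ask for help", avoiding blind reliance on expensive frontier models and improving the efficiency of agent systems.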

arXiv:2605.16604v1 Announce Type: new Abstract: Efficient agentic systems should incur expensive frontier-model costs only on decisions where a cheape…

SLM, agent, large-model optimization, task routing, efficiency

10

💰 Business Tech arXiv Machine Learning 2026-05-20

Your SaaS Is an Insurance Product: A Modeling Framework

Reveals the structural similarity between SaaS products and insurance, offering a new perspective on business modeling.

arXiv:2605.16699v1 Announce Type: new Abstract: Capped-usage SaaS products -- LLM subscriptions such as Claude Code and ChatGPT, cloud platforms such …

SaaS, insurance, pricing models, risk management, product framework

11

📝 Deep Tech arXiv Machine Learning 2026-05-20

Convex Dataset Valuation for Post-Training

Proposes a convex dataset valuation method to address the cost-performance trade-off in dataset selection for LLM post-training.

arXiv:2605.16704v1 Announce Type: new Abstract: Improving LLM performance on downstream tasks sometimes requires leveraging auxiliary datasets during …

dataset valuation, post-training, convex optimization, LLM, data selection

12

📝 Deep Tech arXiv Machine Learning 2026-05-20

EmoMind: Decoding Affective Captions from Human Brain fMRI

Decodes affective captions from brain fMRI signals, a new advance at the intersection of AI and neuroscience.

arXiv:2605.16739v1 Announce Type: new Abstract: Decoding visual experience from brain activity has advanced substantially, but current brain-to-text…

fMRI, affective decoding, brain-computer interface, affective AI, neuroscience

13

📝 Deep Tech arXiv Machine Learning 2026-05-20

Propagation of Chaos in Contextual Flow Maps

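Studies propagation of chaos in contextual flow maps, developing a quantitative statistical theory of transformers in the large-context regime.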

arXiv:2605.16747v1 Announce Type: new Abstract: We develop a quantitative statistical theory of transformers in the large-context regime by adopting t…

propagation of chaos, contextual flow maps, complex systems, theoretical models, arXiv paper

14

📝 Deep Tech arXiv Machine Learning 2026-05-20

The Unlearnability Phenomenon in RLVR for Language Models

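Reveals a counterintuitive phenomenon in RLVR training in which LLMs fail to learn from hard examples, challenging conventional understanding.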

arXiv:2605.16787v1 Announce Type: new Abstract: Reinforcement Learning with Verifiable Reward (RLVR) has proven effective in improving Large Language …

RLVR, unlearnability, language models, reinforcement learning, reasoning ability

15

🤖 AI · Large Models arXiv Machine Learning 2026-05-20

Decoupling KL and Trajectories: A Unified Perspective for SFT, DAgger, Offline RL, and OPD in LLM Distillation

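An LLM distillation paper that unifies the perspectives of SFT, DAgger, offline RL, and OPD by decoupling KL and trajectories, providing a new theoretical framework for model optimization.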

arXiv:2605.16826v1 Announce Type: new Abstract: Knowledge distillation is central to LLM post-training, yet its design space remains poorly understood…

LLM distillation, KL divergence, supervised fine-tuning, reinforcement learning, trajectory optimization

16

🤖 AI · Large Models arXiv Machine Learning 2026-05-20

BoLT: A Benchmark to Democratize Black-box Optimization Research for Expensive LLM Tasks

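BoLT, a black-box optimization benchmark for expensive LLM tasks, lowers the barrier to entry and helps democratize research in the field.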

arXiv:2605.17000v1 Announce Type: new Abstract: Optimization of LLM training and inference configurations, such as hyperparameters, data mixtures, and…

BoLT, black-box optimization, large language models, benchmarking, LLM

17

📝 Deep Tech arXiv Machine Learning 2026-05-20

Learning-Zone Energy: Online Data Selection for Efficient RL Post-Training

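Proposes Learning-Zone Energy, a method that selects data online to make RL post-training more efficient, avoiding the compute wasted by uniform allocation.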

arXiv:2605.17003v1 Announce Type: new Abstract: Reinforcement Learning (RL) post-training has emerged as the dominant paradigm for eliciting mathemati…

data selection, reinforcement learning, post-training, large language models, compute optimization

18

🤖 AI · Large Models arXiv Machine Learning 2026-05-20

Privacy Policy Enforcement Guardrails for Data-Sensitive Retrieval-Augmented Generation

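Targets data leakage in RAG systems with a Privacy Policy Enforcement (PPE) framework that uses dual density estimators and embedding fusion to detect clusters of non-regulated attributes.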

arXiv:2605.17034v1 Announce Type: new Abstract: Standard PII filters often miss contextual data leakage in RAG systems, such as non-regulated attribut…

RAG systems, privacy policy enforcement, data leakage, density estimation, text embeddings

19

📝 Deep Tech arXiv Machine Learning 2026-05-20

D$^2$Evo: Dual Difficulty-Aware Self-Evolution for Data-Efficient Reinforcement Learning

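Proposes a dual difficulty-aware self-evolution method that addresses the challenges of scarce training data and shifting difficulty in reinforcement learning.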

arXiv:2605.17037v1 Announce Type: new Abstract: Reinforcement learning (RL) has demonstrated potential for enhancing reasoning in large language model…

reinforcement learning, large language models, data efficiency, adaptive difficulty, self-evolution

20

📝 Deep Tech arXiv Machine Learning 2026-05-20

S-Bus: Automatic Read-Set Reconstruction for Multi-Agent LLM State Coordination

A new approach to multi-agent LLM state coordination: automatic read-set reconstruction with no SDK changes required.

arXiv:2605.17076v1 Announce Type: new Abstract: Concurrent LLM agents sharing mutable natural-language state produce Structural Race Conditions (SRCs)…

S-Bus, HTTP middleware, multi-agent, LLM, state coordination
