Niu Ge's Picks · This Week

1

🤖 AI · Large Models arXiv AI 2026-05-25

Skill Retrieval Augmentation for Agentic AI

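Teaches LLM agents to retrieve external skills, removing the limits of explicit enumeration and improving their handling of complex tasks.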

arXiv:2604.24594v2 Announce Type: replace-cross Abstract: As large language models (LLMs) evolve into agentic problem solvers, they increasingly rely …

skill retrieval augmentation, agentic AI, large language models, external skill reuse, LLM agents

2

📝 In-Depth Technical arXiv AI 2026-05-23

Can AI Make Conflicts Worse? An Alignment Failure in LLM Deployment Across Conflict Contexts

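New research shows that AI models deployed in regions affected by armed conflict may inadvertently escalate tensions through alignment failures, a warning for AI safety and global governance.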

arXiv:2605.22720v1 Announce Type: new Abstract: AI models are already deployed in societies affected by armed conflict, and journalists, humanitarian …

AI safety, conflict ethics, alignment failure, large language models

3

📝 In-Depth Technical arXiv AI 2026-05-21

Probing Embodied LLMs: When Higher Observation Fidelity Hurts Problem Solving

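Raising observation fidelity can actually reduce problem-solving ability: this study challenges the conventional view of embodied LLMs and reveals an unexpected trade-off between fidelity and reasoning.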

arXiv:2605.20072v1 Announce Type: new Abstract: Large Language Models are increasingly proposed as cognitive components for robotic systems, yet their…

embodied intelligence, large language models, observation fidelity, problem solving, perceptual reasoning

4

🤖 AI · Large Models arXiv NLP 2026-05-20

Measuring Stereotype and Deviation Biases in Large Language Models

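New research identifies two subtle kinds of bias in LLMs, stereotype bias and deviation bias, and presents a method for quantifying them.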

arXiv:2508.06649v3 Announce Type: replace Abstract: Large language models (LLMs) are widely applied across diverse domains, raising concerns about the…

large language models, bias measurement, stereotypes, deviation bias, AI fairness

5

📝 In-Depth Technical arXiv Machine Learning 2026-05-20

When the Loop Closes: Architectural Limits of In-Context Isolation, Metacognitive Co-option, and the Two-Target Design Problem in Human-LLM Systems

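A human outsources cognitive self-regulation to a large model, and the system collapses within 48 hours, exposing the architectural limits of human-LLM interaction systems and the problem of metacognitive co-option.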

arXiv:2604.15343v2 Announce Type: replace-cross Abstract: We report a detailed autoethnographic case study of a single-subject who deliberately constr…

large language models, cognitive externalization, prompt engineering, metacognition, architectural limits

6

📝 In-Depth Technical arXiv NLP 2026-05-20

The Homogenization Problem in LLMs: Towards Meaningful Diversity in AI Safety

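How does homogenization in large models threaten AI safety? This paper systematically analyzes the root causes and possible ways forward.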

arXiv:2601.06116v5 Announce Type: replace-cross Abstract: Generative AI models reproduce the human biases in their training data and further amplify t…

large language models, AI safety, homogenization, diversity, mode collapse

7

🤖 AI · Large Models arXiv NLP 2026-05-20

Generative Artificial Intelligence for Literature Reviews

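LLM-based generative AI is reshaping the traditional literature review workflow; capabilities such as summarization, question answering, and data extraction can speed up research.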

arXiv:2605.16475v1 Announce Type: cross Abstract: Generative artificial intelligence (GenAI), based on large-language models (LLMs), such as ChatGPT, …

generative AI, literature reviews, large language models, ChatGPT, academic research

8

🤖 AI · Large Models arXiv NLP 2026-05-20

WASIL: In-the-Wild Arabic Spoken Interactions with LLMs

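The first dataset of real-world spoken Arabic interactions, built to study how ASR errors affect LLM voice assistants, filling a gap in the field.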

arXiv:2605.16364v1 Announce Type: cross Abstract: Large Language Models (LLMs) voice assistants are commonly built as cascaded Automatic Speech recogn…

Arabic, voice interaction, large language models, datasets, ASR

9

🤖 AI · Large Models arXiv NLP 2026-05-20

Language Acquisition Device in Large Language Models

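Explores how, drawing on the idea of a language acquisition device, pre-training on synthetic languages can improve the data efficiency of large models.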

arXiv:2605.16758v1 Announce Type: new Abstract: Large Language Models (LLMs) remain substantially less data-efficient than humans. Pre-pretraining (PP…

large language models, language acquisition, data efficiency, synthetic language pre-training

10

🤖 AI · Large Models arXiv AI 2026-05-20

Efficient Multi-objective Prompt Optimization via Pure-exploration Bandits

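Uses pure-exploration bandit algorithms for multi-objective prompt optimization, moving beyond single-metric limits to find the best LLM prompts efficiently.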

arXiv:2605.14553v1 Announce Type: cross Abstract: Prompt engineering has become central to eliciting the capabilities of large language models (LLMs).…

multi-objective optimization, prompt engineering, large language models, bandit algorithms

11

📝 In-Depth Technical arXiv Machine Learning 2026-05-20

Prune, Update and Trim: Robust Structured Pruning for Large Language Models

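Proposes a new structured pruning method that compresses large models efficiently while preserving robustness; of interest to researchers working on model optimization.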

arXiv:2605.18331v1 Announce Type: new Abstract: Large Language Models (LLMs) have experienced significant growth and development in recent years. Howe…

large language models, structured pruning, model compression, robustness, pruning methods

12

📝 In-Depth Technical arXiv AI 2026-05-19

AIPO: Learning to Reason from Active Interaction

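Proposes a new method that improves large-model reasoning through active interaction, addressing limitations of existing reinforcement learning approaches.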

arXiv:2605.08401v2 Announce Type: replace-cross Abstract: Recent advances in large language models (LLMs) have demonstrated remarkable reasoning capab…

large language models, reasoning, reinforcement learning, active interaction, AIPO

13

📝 In-Depth Technical arXiv AI 2026-05-19

Dynamics of the Transformer Residual Stream: Coupling Spectral Geometry to Network Topology

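Treats Transformer depth as discrete time and reveals how spectral geometry couples to network topology in the residual stream, offering a new perspective on how computation propagates through large models.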

arXiv:2605.14258v1 Announce Type: cross Abstract: Large language models are remarkably capable, yet how computation propagates through their layers re…

Transformer, residual stream dynamics, spectral geometry, network topology

14

📝 In-Depth Technical arXiv NLP 2026-05-19

Toward LLMs Beyond English-Centric Development

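Examines the English bias of large models, argues that the supposed cost advantage of continued pre-training does not hold, and suggests language-specific investment may become necessary.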

arXiv:2605.15613v1 Announce Type: new Abstract: Through an analysis of sequences generated by open-weight large language models (LLMs), we demonstrate…

large language models, English bias, multilingual, continued pre-training, cultural understanding

15

🤖 AI · Large Models arXiv AI 2026-05-19

Reasoners or Translators? Contamination-aware Evaluation and Neuro-Symbolic Robustness in Tax Law

New research: LLM tax-law reasoning is exposed to data contamination risk, so apparent understanding may not be real.

arXiv:2605.16052v1 Announce Type: new Abstract: Recent advances in large language models (LLMs) have significantly enhanced automated legal reasoning.…

large language models, legal reasoning, data contamination, neuro-symbolic robustness, tax law

16

🤖 AI · Large Models arXiv NLP 2026-05-19

Common Corpus: The Largest Collection of Ethical Data for LLM Pre-Training

Common Corpus, the largest collection of ethical data, is released, providing high-quality, compliant data for LLM pre-training.

arXiv:2506.01732v3 Announce Type: replace Abstract: Large Language Models (LLMs) are pre-trained on large amounts of data from different sources and d…

LLM, datasets, pre-training, ethics, Common Corpus
