牛哥精选 (Niu Ge's Picks) · Past Six Months

1

🤖 AI · Large Models arXiv Computer Vision 2026-07-15 NEW

Authoring for Living Worlds: Tool-Constrained LLM Agents for Executable Multi-Actor Scenarios

AI agents automatically generate executable narrative scenarios for 3D games, letting non-experts create dynamic multi-character videos.

arXiv:2604.10383v2 Announce Type: replace Abstract: We use LLM agents to author executable specifications for a living world: formal Graphs of Events …

llm agents, 3D narrative, game engine, scene generation, executable specifications

2

🤖 AI · Large Models arXiv AI 2026-07-14

AgentCheck: A Reproduce-Intervene-Mitigate Workbench for LLM Agents over MCP

A workbench for LLM agents that supports reproduction, intervention, and mitigation to improve agent reliability in MCP environments.

arXiv:2607.11098v1 Announce Type: cross Abstract: Tool-using LLM agents are mostly evaluated assuming all tools work. When a tool times out, returns a…

agentcheck, llm agents, mcp, workbench, reliability

3

📝 Deep Tech arXiv AI 2026-07-13

Toward Auditable AI Scientists: A Hypothesis Evolution Protocol for LLM Agents

An auditable hypothesis evolution protocol for LLM agents that makes the AI scientist's process transparent and trustworthy.

arXiv:2607.09195v1 Announce Type: new Abstract: Large language model (LLM) agents are increasingly expected to play a central role in AI-driven scient…

llm agents, auditable AI, hypothesis evolution protocol, scientific discovery, AI research transparency

4

📝 Deep Tech arXiv AI 2026-07-13

The LLMbda Calculus: AI Agents, Conversations, and Information Flow

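Introduces the LLMbda calculus, which uses formal methods to model information flow in AI agent conversations, combining theoretical depth with practical relevance.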

arXiv:2602.20064v2 Announce Type: replace-cross Abstract: Large language models are increasingly deployed as agents: they plan, call tools, read untru…

lambda cal, ai agents, dialogue systems, information flow, large language models

5

⚡ Productivity Tools Dev.to 2026-07-10

AI Agents for DevOps in 2026: Tools That Are Actually Worth Using

Hands-on testing in a Kubernetes workflow that separates genuinely useful AI DevOps tools from marketing noise.

I've been testing a bunch of AI tools in my Kubernetes workflow over the past few months and wanted to share what's genuinely changed my day-to-day vs…

ai agents, devops, kubernetes, tool reviews, 2026 trends

6

🤖 AI · Large Models Dev.to 2026-07-10

Versiona acciones de correo en agentes LLM (Version Email Actions in LLM Agents)

Pin email actions to version numbers so the LLM agent stops improvising, improving predictability and reliability.

Many teams that integrate LLMs with email obsess over the prompt and leave the execution contract half-blurred. In my experience, the failure…

llm agents, version control, email actions, controllability, action definitions

7

🤖 AI · Large Models Dev.to 2026-07-09

AI Agents aren't magic

An AI agent is not magic: at its core it is an LLM plus orchestration, context, and tools, executed in a loop.

Today, I think all areas, especially IT, are becoming involved with AI. Right now, I think the term I hear most often is AI Agent. For non-technical p…

ai agents, llm, claude cod, tool loop

8

🤖 AI · Large Models arXiv AI 2026-07-07

CoopEval: Benchmarking Cooperation-Sustaining Mechanisms and LLM Agents in Social Dilemmas

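An ICML 2026 paper that builds the first systematic benchmark of LLM agents' ability to cooperate in social dilemmas and reveals key design choices for cooperation-sustaining mechanisms.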

arXiv:2604.15267v2 Announce Type: replace-cross Abstract: It is increasingly important that LLM agents interact effectively and safely with other goal…

llm agents, social dilemmas, cooperation mechanisms, benchmarking, icml 2026

9

🤖 AI · Large Models arXiv AI 2026-07-07

SPORK: Self-Speculative Forking to Accelerate Agentic LLM Inference

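Self-speculative forking lets an LLM agent pre-generate subsequent reasoning while it waits for a tool call to return, substantially reducing GPU idle time and improving inference efficiency.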

arXiv:2607.03333v1 Announce Type: cross Abstract: LLM agents are becoming a common interface for research, coding, and question answering, yet their T…

llm agents, inference acceleration, speculative execution, GPU utilization, tool calling

10

🤖 AI · Large Models arXiv AI 2026-07-07

ToolFailBench: Diagnosing Tool-Use Failures in LLM Agents

The new ToolFailBench benchmark pinpoints why LLM agents' tool calls fail, exposing defects hidden beneath aggregate scores.

arXiv:2607.04686v1 Announce Type: cross Abstract: Tool calling is central to modern language model agents, but aggregate benchmark scores often hide w…

toolfailbe, llm agents, tool calling, failure diagnosis, benchmarking

11

📝 Deep Tech arXiv AI 2026-07-07

No Time Like the Present: Agentic Test-Time Training for LLM Agents

Proposes Agentic Test-Time Training, which lets LLM agents adapt their weights during long-horizon tasks to address trajectory degradation and policy failure.

arXiv:2607.03441v1 Announce Type: cross Abstract: LLM agents often degrade over long episodes: as trajectories grow, they revisit explored states, rep…

llm agents, test-time training, adaptive learning, long-trajectory degradation, model weight adjustment

12

🚀 Product Watch AWS Blog 2026-07-07

AWS Weekly Roundup: Claude Sonnet 5 on AWS, Amazon WorkSpaces for AI agents, AWS service availability updates, and more (July 6, 2026)

AWS weekly roundup highlights: Claude Sonnet 5 launches on AWS, Amazon WorkSpaces for AI agents, and Graviton5 instances with a 25% performance jump.

A couple of editions ago I wrote about what I find so energizing about working with startups. Last week I got a fresh dose of it: I spent a few days w…

claude son, aws, graviton5, ai agents, amazon wor

13

🤖 AI · Large Models arXiv Machine Learning 2026-07-07

Social Networks of LLM Agents

Investigates how LLM agents form social networks, pointing to a new direction for AI simulation of human social behavior.

arXiv:2607.03695v1 Announce Type: new Abstract: Large language model (LLM) agents are increasingly deployed in interacting populations, raising the qu…

llm agents, social networks, multi-agent, arxiv paper

14

📝 Deep Tech Dev.to 2026-07-02

Don't generate your AGENTS.md with an LLM

Don't let an LLM generate your project documentation for you; only real context safeguards the task success rate.

The counterintuitive finding Generating your AGENTS.md from a model feels efficient. The research disagrees: developer-written vs LLM-generated instru…

agents.md, llm, project context, best practices, task success rate

15

🤖 AI · Large Models arXiv NLP 2026-07-01

Generative Skill Composition for LLM Agents

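LLM agents improve their handling of complex tasks through generative skill composition, with modular packaging unlocking a higher level of automation.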

arXiv:2606.32025v1 Announce Type: new Abstract: Recent LLM agents benefit from skills for solving complex tasks. Skills encapsulate modular packages o…

llm agents, skill composition, generative composition, modularity, complex tasks

16

🤖 AI · Large Models Dev.to 2026-06-30

2026年版：FastAPIエージェントに渡すCLAUDE.md/AGENTS.mdの実例と書き方 (2026 Edition: Examples of and How to Write CLAUDE.md/AGENTS.md for FastAPI Agents)

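A step-by-step guide to writing a CLAUDE.md rule file for a FastAPI agent, using verifiable instructions to keep the AI from generating code that looks right but is wrong.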

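Why rule files determine agent quality: the vaguer the instructions, the more likely an LLM-based AI agent is to return code that "looks plausible but is wrong". In FastAPI projects the most noticeable patterns are mixed use of async def, misuse of dependency injection, and omitted response models. Rule files prevent this by acting as a "…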

edition, worked examples, fastapi, ai agent, claude.md

17

🤖 AI · Large Models Dev.to 2026-06-30

AGENTS.md: The One File That Makes AI Coding Agents Actually Useful

One simple file that lets AI coding agents understand your project precisely, a new tool for more efficient collaboration.

If you’ve used Claude Code, Cursor, Codex, Aider, Gemini CLI, GitHub Copilot, Grok, goose, or similar tools, you’ve seen the same pattern: the agent’s…

AI coding agents, agents.md, project configuration, developer tools, prompt engineering

18

⚡ Productivity Tools Dev.to 2026-06-29

2026年版：FastAPIエージェントに渡すルールファイル(CLAUDE.md/AGENTS.md)の実例と書き方 (2026 Edition: Examples of and How to Write Rule Files (CLAUDE.md/AGENTS.md) for FastAPI Agents)

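A practical guide to using rule files in FastAPI projects to unify the code style of AI agents and to fix messy endpoints and inconsistent dependency injection.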

Why rule files work for AI agents: AI agents such as Claude Code and Cursor read the CLAUDE.md / .cursorrules / AGENTS.md file in the project root much like a system prompt. As a result, the style of the code the agent generates and the prohibited…

edition, worked examples, fastapi, claude.md, agents.md

19

🔓 Open Source Projects Hacker News AI 2026-06-29

Ablo – The collaboration layer for AI agents

Ablo is an open-source collaboration layer that lets humans and AI agents work together on the same dataset without conflicts.

Article URL: https://github.com/Abloatai/ablo Comments URL: https://news.ycombinator.com/item?id=48711106 Points: 1 # Comments: 0

ai agents, collaboration layer, open-source project, data collaboration, conflict resolution

20

⚡ Productivity Tools Dev.to 2026-06-27

DESIGN.md, CLAUDE.md, AGENTS.md: The Agent-Context File Family

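Three core files give AI coding agents persistent context, covering code conventions, visual design, and more, to improve collaboration efficiency.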

DESIGN.md, CLAUDE.md and AGENTS.md are plain-text, repo-resident files that give AI coding agents persistent context. CLAUDE.md and AGENTS.md cover co…

design.md, claude.md, agents.md, AI coding agents, context files
