Niu Ge's Picks · This Month

1

📝 Deep Tech arXiv NLP 2026-06-12

One Token to Fool LLM-as-a-Judge

A single token can be enough to fool an LLM judge, exposing a security weakness in AI evaluation pipelines.

arXiv:2507.08794v3 Announce Type: replace-cross Abstract: Large language models (LLMs) are increasingly trusted as automated judges, assisting evaluat…

AI safety · LLM vulnerabilities · automated judging · single-token attack · model trustworthiness

2

📝 Deep Tech arXiv AI 2026-06-12

ReSum: Synergizing LLM Reasoning and Summarization with Reinforcement Learning

ReSum uses reinforcement learning to combine LLM reasoning with summarization, targeting the inefficiency of long reasoning chains.

arXiv:2606.13316v1 Announce Type: new Abstract: Reinforcement Learning with Verifiable Rewards (RLVR) is a central technique for improving long-horizo…

reinforcement learning · LLM reasoning · summarization · RLVR · joint optimization

3

📝 Deep Tech arXiv AI 2026-06-11

Certifiable Safe RLHF: Semantic Grounding and Fixed Penalty Constraint Optimization for Safer LLM Alignment

Semantic grounding plus fixed-penalty constraint optimization gives the LLM alignment process certifiable safety guarantees.

arXiv:2510.03520v2 Announce Type: replace-cross Abstract: Ensuring safety is a foundational requirement for large language models (LLMs). Achieving an…

Safe RLHF · LLM alignment · semantic grounding · constrained optimization · safety

4

🤖 AI & LLMs arXiv AI 2026-06-10

TruthRL: Incentivizing Truthful LLMs via Reinforcement Learning

TruthRL uses reinforcement learning to make LLM answers more truthful; the code is open source.

arXiv:2509.25760v2 Announce Type: replace-cross Abstract: While large language models (LLMs) have demonstrated strong performance on factoid question …

TruthRL · reinforcement learning · LLM · truthfulness · hallucination

5

📝 Deep Tech arXiv AI 2026-06-10

Beyond Uniform Token-Level Trust Region in LLM Reinforcement Learning

Non-uniform token-level trust regions move past the uniform setting to improve the stability of LLM reinforcement learning training.

arXiv:2606.10968v1 Announce Type: cross Abstract: Reinforcement learning with verifiable rewards (RLVR) has become standard for improving LLM reasonin…

LLM · reinforcement learning · trust region · token-level optimization · RLHF

6

🤖 AI & LLMs arXiv AI 2026-06-10

TD-Grokking: Learning from Zero-Reward Problems by Training-Time Decomposition

Training-time decomposition tackles zero-reward problems in LLM reasoning training, letting the model learn from failed trajectories.

arXiv:2606.09883v1 Announce Type: cross Abstract: Large language models (LLMs) have made remarkable progress in reasoning tasks, largely driven by pos…

large language models · reasoning · reinforcement learning · zero-reward problems · training-time decomposition

7

📝 Deep Tech arXiv AI 2026-06-09

ComplexConstraints and Beyond: Expert Rubrics for RLVR

Proposes expert rubrics for handling complex constraints in RLVR, offering a new approach to reward design in reinforcement learning.

arXiv:2606.09118v1 Announce Type: new Abstract: As LLM capabilities advance rapidly, the evaluation methods used to assess them increasingly lag behin…

RLVR · expert rubrics · complex constraints · reinforcement learning · reward design

8

📝 Deep Tech arXiv Machine Learning 2026-06-09

CATPO: Critique-Augmented Tree Policy Optimization

CATPO applies critique-augmented tree policy optimization to obtain dense rewards more efficiently in LLM reasoning.

arXiv:2606.08346v1 Announce Type: cross Abstract: Reinforcement learning with verifiable rewards (RLVR) has become a dominant paradigm for improving t…

reinforcement learning · large language models · reasoning ability · tree policy optimization · critique augmentation

9

💰 Business & Tech IT之家 2026-06-09

OpenAI files for IPO while Altman's eye-scanning company, valued at $2.5 billion, is reported to be laying off staff

As OpenAI pushes toward an IPO, Altman's $2.5 billion eye-scanning company faces layoffs and regulatory pressure in multiple countries.

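IT之家, June 9: OpenAI announced on Monday local time that it has confidentially filed for an initial public offering (IPO), which could become one of the most notable listings of the past decade. Separately, Business Insider reports that Tools for Humanity, another company led by OpenAI CEO Sam Altman, is laying off staff. …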

IPO filing · Altman · eye-scanning company · layoffs

10

🚀 Product Watch TechCrunch 2026-06-09

As OpenAI files for IPO, Sam Altman’s eye-scanning company is doing layoffs, report says

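As OpenAI heads toward an IPO, Sam Altman's iris-scanning company World is laying off staff amid regulatory and commercial difficulties, a sharp contrast between the two ventures.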

Tools for Humanity, Sam Altman's identity verification company, is reportedly struggling to generate revenue and will downsize its staff.

OpenAI · IPO · Sam Altman · Worldcoin · layoffs

11

📝 Deep Tech arXiv NLP 2026-06-08

What Do People Actually Want From AI? Mapping Preference Plurality

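This conference paper argues that aggregating preferences in RLHF is fundamentally flawed and systematically maps the plural, real-world needs people have of AI.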

arXiv:2606.06674v1 Announce Type: new Abstract: Large Language Models (LLMs) are often fine-tuned through Reinforcement Learning from Human Feedback (…

large language models · RLHF · plurality · FAccT 2026

12

🔧 Dev Tools IT之家 2026-06-07

Microsoft backs down, withdraws legal threat against white-hat hacker “Nightmare Eclipse”

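After strong pushback from the cybersecurity community, Microsoft has withdrawn its legal threat against the white-hat hacker “Nightmare Eclipse”, who had publicly disclosed vulnerabilities outside Microsoft's usual submission process.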

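IT之家, June 7: According to a report today from tech outlet Notebookcheck, Microsoft has formally withdrawn its earlier hardline legal threat against the white-hat hacker “Nightmare Eclipse” after strong opposition from cybersecurity professionals worldwide. According to the report, Nightmare Eclipse had previously bypassed Microsoft's traditional vulnerability submission process and directly disclosed multiple Window…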

Microsoft · white-hat hacker · Nightmare Eclipse · legal threat

13

🔓 Open Source Hacker News Show 2026-06-07

Show HN: Lathe – Use LLMs to learn a new domain, not skip past it

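Lathe uses LLMs to help you dig into a new domain rather than shortcut past it; each tutorial transparently records its sources, model, and prompts.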

Hey HN! Lathe is an experiment in using LLMs to teach me something new, instead of doing the work for me. It generates a hands-on, source-backed tutor…

Lathe · LLM · domain learning · open-source tool · tutorials

14

🤖 AI Tools Hacker News AI 2026-06-06

Show HN: Summarize YT Video by pasting url into AI chat [video]

Paste a YouTube link into the AI chat to get an automatic summary without watching the video; the feature is built into the chat tool.

We added tooling to our chat to make it agentic. It can control our 40+ apps suite. One of the tools is url fetching with pagination. Comments URL: ht…

video summary · AI chat · URL pasting · automatic summarization · YouTube

15

🤖 AI & LLMs Hacker News LLM 2026-06-05

Show HN: Clarity, See what concepts your LLM uses and trace it to training data

Clarity is an interpretable-AI platform that shows which concepts an LLM uses and traces them back to training data.

Article URL: https://www.guidelabs.ai/post/meet-clarity/ Comments URL: https://news.ycombinator.com/item?id=48401606 Points: 3 # Comments: 1

explainable AI · LLM · training-data tracing · steerling · AI transparency

16

📝 Deep Tech arXiv AI 2026-06-04

A Systematic Investigation of RL-Jailbreaking in LLMs

A systematic study of reinforcement-learning-based jailbreak attacks on LLMs, highlighting a new class of AI safety risk.

arXiv:2605.07032v2 Announce Type: replace-cross Abstract: The evolution of generative models from next-token predictors to autonomous engines of compl…

rl-jailbre · LLM safety · adversarial attacks · reinforcement learning · LLM vulnerabilities

17

📝 Deep Tech arXiv AI 2026-06-04

AgentJet: A Flexible Swarm Training Framework for Agentic Reinforcement Learning

A flexible swarm training framework intended to make agentic reinforcement learning more efficient and better coordinated.

arXiv:2606.04484v1 Announce Type: new Abstract: We present AgentJet, a distributed swarm training framework for large language model (LLM) agent reinf…

AgentJet · swarm training · reinforcement learning · framework · agents

18

📝 Deep Tech arXiv AI 2026-06-03

Libra: Efficient Resource Management for Agentic RL Post-Training

Libra is a resource management approach for agentic RL post-training that aims to cut training cost and improve performance.

arXiv:2606.03077v1 Announce Type: cross Abstract: Reinforcement learning (RL) has become a standard post-training paradigm for large language models (…

Libra · resource management · agentic RL · post-training · reinforcement learning

19

🔗 Link Tools arXiv AI 2026-06-03

AUGUSTE: Online-Learning dApp for Predictive URLLC Scheduling

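AUGUSTE is an online-learning dApp for 5G URLLC that uses AI-based predictive scheduling to target ultra-reliable, low-latency communication at the 1 ms level.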

arXiv:2606.03664v1 Announce Type: cross Abstract: Ultra Reliable and Low Latency Communications (URLLC) was one of the main motivations behind 5G, wit…

online learning · URLLC scheduling · 5G networks · prediction algorithms · dApp

20

🔧 Dev Tools Dev.to 2026-06-02

URL Encoding Explained: Special Characters and How to Handle Them

Quickly encode or decode special characters in URLs to keep links safe and error-free, with support for multiple encoding formats.

URL Encoding Explained: Special Characters and How to Handle Them. 📅 May 25, 2026 · ⏱️ 7 min read · 🔗 Network Tools. Every character in a URL has a meaning. S…

URL encoding · URL decoding · special characters · link safety · network tools
