牛哥精选 (Niu Ge's Picks) · This Month

1

📝 Deep Tech arXiv NLP 2026-06-12

One Token to Fool LLM-as-a-Judge

A single token is enough to fool an LLM judge, exposing a security weak spot in AI evaluation pipelines.

arXiv:2507.08794v3 Announce Type: replace-cross Abstract: Large language models (LLMs) are increasingly trusted as automated judges, assisting evaluat…

AI safety · LLM vulnerabilities · automated judging · single-token attack · model trustworthiness

2

🤖 AI · Large Models arXiv AI 2026-06-12

PaLMR: Towards Faithful Visual Reasoning via Multimodal Process Alignment

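PaLMR uses multimodal process alignment to achieve faithful visual reasoning, improving large models' image understanding and logical consistency.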

arXiv:2603.06652v2 Announce Type: replace-cross Abstract: Reinforcement learning has recently improved the reasoning ability of Large Language Models …

visual reasoning · multimodal alignment · trustworthy AI · large language models · process supervision

3

🤖 AI · Large Models arXiv AI 2026-06-11

A Survey on Evaluating Quality and Trustworthiness in LLM-Generated Data

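A comprehensive survey of methods for evaluating the quality and trustworthiness of LLM-generated data, covering frameworks, metrics, and challenges; essential reading for researchers in the area.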

arXiv:2601.17717v3 Announce Type: replace Abstract: Large Language Models (LLMs) have emerged as powerful tools for generating data across various mod…

LLM-generated data · quality evaluation · trustworthiness · data quality · survey

4

📝 Deep Tech arXiv AI 2026-06-11

"That's AI Slop, You Bot!" Studying Accusations, Evidence, and Credibility in Online Discourse Towards LLM-Generated Comments

Analyzes 25 million comments to show how readers identify and call out AI-generated content, and how anti-AI attitudes have evolved.

arXiv:2606.12073v1 Announce Type: cross Abstract: Generative AI has made fluent prose cheap to produce, breaking the old promise to readers that good …

LLM-generated text · online discourse · anti-AI attitudes · credibility · social media research

5

🔗 Links & Tools Hacker News AI 2026-06-11

LinkedIn's AI content boom is creating a credibility problem for founders

Article URL: https://www.inc.com/netta-jenkins/linkedin-is-being-flooded-with-ai-content-and-a-new-problem-is-emerging/91354856 Comments URL: https://…

AI detection · content authenticity · writing assistance · plagiarism prevention · credibility assessment

6

🚀 Product Watch OpenAI Official Blog 2026-06-10

From data to decisions: how LSEG is scaling trusted AI

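LSEG partners with OpenAI to build trusted AI, driving data-based decisions across its global business and boosting the productivity of 4,000 employees. Well worth a read.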

See how LSEG uses OpenAI to scale trusted AI across its global business, accelerating insights, shrinking release cycles, and empowering 4,000 employe…

LSEG · OpenAI · trusted AI · enterprise AI · data-driven decisions

7

🤖 AI · Large Models arXiv AI 2026-06-10

Mitigating Manifold Departure: Uncertainty-Aware Subspace Rectification for Trustworthy MLLM Decoding

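New ICML 2026 research uses uncertainty-aware subspace rectification to make multimodal LLM decoding more trustworthy, effectively mitigating manifold departure.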

arXiv:2606.09859v1 Announce Type: cross Abstract: MLLMs frequently hallucinate objects inconsistent with visual inputs. This issue is typically attrib…

multimodal LLMs · uncertainty quantification · manifold learning · trustworthy decoding · subspace rectification

8

🚀 Product Watch IT之家 (ITHome) 2026-06-10

IDC analyst: WWDC26 is a "credibility" test for Apple's AI

Apple unveils iOS 27 and Siri AI at WWDC26; an analyst calls it a credibility test for Apple's AI strategy, and the Cook era may be drawing to a close.

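IT之家, June 9: Bloomberg reporter Mark Gurman wrote on the 9th (today) that Apple used its Worldwide Developers Conference to pave the way for its next generation of products, trying to show that after years of delayed AI features and rocky rollouts it can still compete in the AI era. At the core of Apple's latest operating systems is a thoroughly overhauled Siri AI, covering iOS 27, …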

analyst · Apple · credibility test

9

📝 Deep Tech arXiv NLP 2026-06-09

CARE: A Conformal Safety Layer for Medical Summarization

Proposes CARE, a conformal safety layer for medical summarization, to improve the safety and trustworthiness of generated content.

arXiv:2606.08969v1 Announce Type: new Abstract: Large language models (LLMs) are increasingly used for medical summarization, but their outputs can om…

medical AI · conformal prediction · safety layer · medical summarization · large language models

10

📝 Deep Tech arXiv AI 2026-06-05

Breaking the Chain: A Causal Analysis of LLM Faithfulness to Intermediate Structures

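Uses causal analysis to reveal how faithful LLMs are to their intermediate reasoning structures, offering a new perspective on improving AI trustworthiness.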

arXiv:2603.16475v2 Announce Type: replace Abstract: In schema-guided reasoning (SGR) pipelines, LLMs produce explicit intermediate structures -- rubri…

LLM · causality · faithfulness · intermediate structures · trustworthiness

11

🤖 AI · Large Models arXiv AI 2026-06-03

CauTion: Knowing When to Trust LLMs for Ensemble Causal Discovery

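When can LLMs be trusted for causal discovery? This paper proposes the CauTion framework, which dynamically assesses how much to trust LLMs within an ensemble to make causal inference more robust.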

arXiv:2606.03602v1 Announce Type: cross Abstract: Causal discovery from observational data remains challenging due to the fundamental limitations of p…

LLM · causal discovery · trustworthiness · ensemble learning · AI safety

12

🔓 Open Source Projects arXiv AI 2026-06-03

OpenAgenet/OAN: Open Infrastructure for Trusted Agent Interconnection

Provides open, trusted interconnection infrastructure for AI agents, helping multi-agent collaboration move into practice.

arXiv:2606.03161v1 Announce Type: cross Abstract: OpenAgenet, abbreviated as OAN, is an open infrastructure project for trusted Agent interconnection.…

OpenAgenet · OAN · agent interconnection · trusted infrastructure · open infrastructure

13

🤖 AI · Large Models arXiv Machine Learning 2026-06-02

Uncertainty-Calibrated Diffusion for Reliable 3D Molecular Graph Generation

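A new method uses uncertainty calibration to improve the reliability of 3D molecular graph generation, with potential benefits for drug discovery and materials design.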

arXiv:2606.01595v1 Announce Type: new Abstract: Bayesian inference provides a principled framework for modeling epistemic uncertainty in neural networ…

uncertainty calibration · diffusion models · 3D molecular graph generation · drug discovery · trustworthy AI

14

🤖 AI · Large Models arXiv AI 2026-06-02

Doing What They Say, Not What They Reason: Locating the Faithfulness Gap in LLM Agents

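New research traces the root of the faithfulness gap in LLM agents that "say one thing and do another," with important implications for building trustworthy AI systems.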

arXiv:2606.00476v1 Announce Type: new Abstract: Do LLM agents act on the reasoning they state? This question of process fidelity is central to using L…

LLM agents · faithfulness gap · reasoning vs. behavior · AI trustworthiness · latest research

15

📝 Deep Tech arXiv Machine Learning 2026-06-02

FLaG: Fine-Grained Latent Grouping for Hallucination Detection

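FLaG proposes fine-grained latent grouping to detect LLM hallucinations precisely, offering a new direction for LLM trustworthiness research.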

arXiv:2606.00301v1 Announce Type: new Abstract: Hallucinations in large language models (LLMs) arise from heterogeneous failure mechanisms, making rel…

hallucination detection · fine-grained latent grouping · large models · AI safety · paper

16

🔓 Open Source Projects arXiv AI 2026-06-01

dashi: A Python library for Dataset Shift Characterization to Support Trustworthy AI Development and Deployment

The open-source Python library "dashi" focuses on characterizing dataset shift to help build more trustworthy AI systems.

arXiv:2605.31360v1 Announce Type: cross Abstract: The Artificial Intelligence (AI) life cycle requires a thorough understanding of the underlying data…

dataset shift · trustworthy AI · Python library · machine learning · data quality

17

📝 Deep Tech Dev.to 2026-05-30

Building a Self-Healing AI Agent: How to Run Untrusted Code Safely Without Blowing Up Your Server

A step-by-step guide to building a self-healing AI agent that runs untrusted code safely without blowing up your server.

Imagine you are building an autonomous AI agent. You give it a terminal tool, a file-writing tool, and the ability to execute Python scripts. You ask …

AI agents · safe execution · untrusted code · self-healing · tool definition layer

18

🤖 AI · Large Models Ars Technica 2026-05-29

LLMs believe false statements even after explicit warnings that they're false

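Research shows that once false beliefs are planted in an LLM through its training data, even explicit warnings fail to correct them, a warning sign for AI safety and factuality.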

Fine-tuning tests show "bias ... toward confidently representing the claims as true."

LLM · false beliefs · training data · AI safety · trustworthiness

19

📝 Deep Tech IT之家 (ITHome) 2026-05-28

New Sandisk SATA SSDs SANDISK 520 / 320 appear on e-commerce sites, based on "Trusted NAND"

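IT之家, May 28: According to a finding by data miner 188号 (@momomo_us), two new consumer (toC) SATA SSDs from Sandisk have appeared on overseas e-commerce platforms, with the model names SANDISK 520 and SANDISK 320. Both SATA SSDs use a 2.5" 7mm form factor, …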

Sandisk · SSD · new products · spotted on e-commerce · based on "trusted"

20

📝 Deep Tech arXiv Machine Learning 2026-05-27

Benchmark Leakage Trap: Can We Trust LLM-based Recommendation?

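The study finds serious benchmark leakage in LLM-based recommender systems, a reminder to treat their reported performance with caution.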

arXiv:2602.13626v3 Announce Type: replace Abstract: The expanding integration of Large Language Models (LLMs) into recommender systems poses critical …

LLM · recommender systems · benchmark leakage · trustworthiness · evaluation pitfalls
