One Token to Fool LLM-as-a-Judge
A single token is enough to fool an LLM judge, exposing a security weakness in AI evaluation pipelines.
arXiv:2507.08794v3 Announce Type: replace-cross Abstract: Large language models (LLMs) are increasingly trusted as automated judges, assisting evaluat…
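PaLMR achieves trustworthy visual reasoning through multimodal process alignment, improving large models' image understanding and logical consistency.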
arXiv:2603.06652v2 Announce Type: replace-cross Abstract: Reinforcement learning has recently improved the reasoning ability of Large Language Models …
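A comprehensive survey of methods for assessing the quality and trustworthiness of LLM-generated data, covering frameworks, metrics, and challenges; a must-read reference for researchers in the area.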
arXiv:2601.17717v3 Announce Type: replace Abstract: Large Language Models (LLMs) have emerged as powerful tools for generating data across various mod…
An analysis of 25 million comments reveals how readers identify and call out AI-generated content, and how anti-AI attitudes have evolved.
arXiv:2606.12073v1 Announce Type: cross Abstract: Generative AI has made fluent prose cheap to produce, breaking the old promise to readers that good …
Article URL: https://www.inc.com/netta-jenkins/linkedin-is-being-flooded-with-ai-content-and-a-new-problem-is-emerging/91354856 Comments URL: https://…
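LSEG teams up with OpenAI to build trusted AI, powering data-driven decisions across its global business and unlocking the productivity of 4,000 employees.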
See how LSEG uses OpenAI to scale trusted AI across its global business, accelerating insights, shrinking release cycles, and empowering 4,000 employe…
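New ICML 2026 research uses uncertainty-aware subspace correction to make multimodal LLM decoding more trustworthy, effectively mitigating manifold deviation.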
arXiv:2606.09859v1 Announce Type: cross Abstract: MLLMs frequently hallucinate objects inconsistent with visual inputs. This issue is typically attrib…
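Apple unveils iOS 27 and Siri AI at WWDC26; analysts call it a credibility test for Apple's AI strategy, and the Cook era may be drawing to a close.
IT之家 (ITHome), June 9: Bloomberg reporter Mark Gurman wrote on the 9th (today) that Apple used its Worldwide Developers Conference to pave the way for its next generation of products, trying to show the outside world that, after years of delayed AI features and rocky rollouts, it can still compete in the AI era. At the core of Apple's latest operating systems is a thoroughly overhauled Siri AI, spanning iOS 27, …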
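Proposes CARE, a conformity-based safety layer for medical summarization that improves the safety and trustworthiness of generated content.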
arXiv:2606.08969v1 Announce Type: new Abstract: Large language models (LLMs) are increasingly used for medical summarization, but their outputs can om…
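Uses causal analysis to reveal how faithful LLMs are to their intermediate reasoning structures, offering a new angle on improving AI trustworthiness.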
arXiv:2603.16475v2 Announce Type: replace Abstract: In schema-guided reasoning (SGR) pipelines, LLMs produce explicit intermediate structures -- rubri…
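When can LLMs be trusted for causal discovery? This paper proposes the CauTion framework, which dynamically evaluates the integrated trust level of LLMs to make causal inference more robust.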
arXiv:2606.03602v1 Announce Type: cross Abstract: Causal discovery from observational data remains challenging due to the fundamental limitations of p…
Provides open, trusted interconnection infrastructure for AI agents, helping multi-agent collaboration move into practice.
arXiv:2606.03161v1 Announce Type: cross Abstract: OpenAgenet, abbreviated as OAN, is an open infrastructure project for trusted Agent interconnection.…
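A new method improves the reliability of 3D molecular graph generation through uncertainty calibration, with the potential to advance drug discovery and materials design.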
arXiv:2606.01595v1 Announce Type: new Abstract: Bayesian inference provides a principled framework for modeling epistemic uncertainty in neural networ…
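New research uncovers the root cause of the faithfulness gap in LLM agents that "say one thing and do another", with important implications for building trustworthy AI systems.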
arXiv:2606.00476v1 Announce Type: new Abstract: Do LLM agents act on the reasoning they state? This question of process fidelity is central to using L…
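FLaG proposes a fine-grained latent grouping method for precisely detecting LLM hallucinations, offering a new direction for LLM trustworthiness research.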
arXiv:2606.00301v1 Announce Type: new Abstract: Hallucinations in large language models (LLMs) arise from heterogeneous failure mechanisms, making rel…
The open-source Python library "dashi" focuses on characterizing dataset shift, helping build more trustworthy AI systems.
arXiv:2605.31360v1 Announce Type: cross Abstract: The Artificial Intelligence (AI) life cycle requires a thorough understanding of the underlying data…
A step-by-step guide to building a self-healing AI agent that safely executes untrusted code without blowing up your server!
Imagine you are building an autonomous AI agent. You give it a terminal tool, a file-writing tool, and the ability to execute Python scripts. You ask …
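Research shows that once false beliefs are planted in an LLM's training data, even explicit warnings fail to correct them, highlighting vulnerabilities in AI safety and factuality.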
Fine-tuning tests show "bias ... toward confidently representing the claims as true."
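IT之家 (ITHome), May 28: According to findings by data miner 188号 (@momomo_us), two new consumer SATA SSDs from Sandisk have appeared on overseas e-commerce platforms, with the model numbers SANDISK 520 and SANDISK 320. Both SATA SSDs use a 2.5" 7mm form factor, …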
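Research finds that LLM-based recommender systems suffer from serious benchmark leakage, a reminder to treat their reported performance with caution.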
arXiv:2602.13626v3 Announce Type: replace Abstract: The expanding integration of Large Language Models (LLMs) into recommender systems poses critical …