Trust or Abstain? A Self-Aware RAG Approach
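Proposes a self-aware RAG method for deciding whether to trust retrieved knowledge or abstain when it conflicts with the model's parametric knowledge.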
arXiv:2605.18792v1 Announce Type: cross Abstract: Retrieval-augmented generation (RAG) improves large language models (LLMs) by incorporating external…
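A framework for verifying trust boundaries in LLM pipelines, using proof-carrying certificates to guarantee deterministic structure; a new approach to security architecture.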
arXiv:2605.16407v1 Announce Type: cross Abstract: We present a framework for verifying the deterministic structured computations surrounding a large l…
Proposes Delta Forcing, a method for balancing responsiveness and stability in interactive autoregressive video generation.
arXiv:2605.14382v2 Announce Type: replace Abstract: Interactive real-time autoregressive video generation is essential for applications such as conten…
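亿格云 raises a Series B round of several hundred million yuan, targeting the emerging field of unified "human + AI" security governance after doubling its growth for three consecutive years.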
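Google Gemini's ability to reason over data across apps raises new privacy and trust challenges, showing how tightly AI's future is bound to personal data.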
Google has big promises for its AI-powered future - and a lot of it depends on your trust. At I/O 2026, Google described a bunch of new tools that it …
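Can LLMs generate gender-aware multimodal behaviors to calibrate users' trust in social agents? This study takes on a key question in human-computer interaction.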
arXiv:2605.19798v1 Announce Type: new Abstract: As Socially Interactive Agents (SIAs) become increasingly integrated into daily life, the ability to c…
Full confidence in native browser features, none in third-party code; specific percentages explain why.
Jeremy doesn’t trust third-party code, but... I’m much more trusting of native browser features: HTML elements, CSS features, and JavaScript APIs. The…
The pandemic exposed not only physical fragility but also cracks in social trust; honesty and isolation are the remedy.
It wasn’t just attacking our bodies. Instead, the pandemic had found a weakness in the unbreakable social bonds that we share with one another. Our ne…
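OpenAI's official guide: how enterprises move from experimentation to deploying AI at scale, with a focus on trust, governance, and workflow design.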
How enterprises scale AI: from early experiments to compounding impact through trust, governance, workflow design, and quality at scale.
How LLMs generate persuasive explanations that manipulate human trust in AI-assisted decisions: a new kind of adversarial attack.
arXiv:2602.04003v3 Announce Type: replace Abstract: Most adversarial threats in artificial intelligence (AI) target the computational behavior of mode…
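ChatGPT introduces a "Trusted Contact" feature: when it detects dangerous conversations such as those involving self-harm, it can proactively notify a person you designate, providing a safety net for mental health.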
You can invite a friend or family member to be your ChatGPT "Trusted Contact."
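Proposes treating agent skills as verifiable artifacts, using a biconditional correctness criterion to address trust in human-AI collaboration; a new paradigm for LLM deployment.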
arXiv:2605.00424v2 Announce Type: replace-cross Abstract: Agent skills - structured packages of instructions, scripts, and references that augment a l…
How a medical AI trusted by doctors scaled on Vercel without compromise, from going viral on TikTok to a rock-solid system.
Andy Yoon was scrolling through Slack when he saw the message: OpenEvidence had gone viral on TikTok. Not "gaining traction." Actually viral, reaching…
LLMs acting as judges favor their own outputs; this paper quantifies the self-preference bias and proposes ways to mitigate it.
arXiv:2604.22891v3 Announce Type: replace-cross Abstract: LLM-as-a-Judge has become a dominant approach in automated evaluation systems, playing criti…
OpenAI launches a trusted-access framework that balances frontier cybersecurity capabilities with protections against misuse.
OpenAI introduces Trusted Access for Cyber, a trust-based framework that expands access to frontier cyber capabilities while strengthening safeguards …
Transparency design for agentic AI: finding the balance between the "black box" and the "data dump" so that users can trust AI decisions.
Designing for agentic AI requires attention to both the system’s behavior and the transparency of its actions. Between the black box and the data dump…
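Addresses the context distribution shift that sequential fine-tuning causes in multi-agent LLM coordination, proposing a trust-region fine-tuning method that effectively improves team performance.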
arXiv:2605.15207v1 Announce Type: new Abstract: Multi-agent LLM systems have shown promise for complex reasoning, yet recent evaluations reveal they o…
AI coding tools may be leaking your sensitive data. How much do you really know?
Every session my AI coding agent reads files, runs commands, makes API calls. I have no idea exactly what ends up in the cloud. Is anyone actually tra…
A technologist resolutely announces a ban on AI in all human communication, solely to protect fragile human trust.
Article URL: https://sam.elborai.me/articles/no-more-llm-comms/ Comments URL: https://news.ycombinator.com/item?id=48181804 Points: 1 # Comments: 0
Can LLM-inferred user states be trusted? This paper proposes a psychometric framework to validate their reliability.
arXiv:2605.15734v1 Announce Type: new Abstract: The use of large language models to assess user states in conversational and adaptive systems is based…