Show HN: Rubric – test what your LLM agent did, not just what it said
A new open-source, practical tool for testing LLM agent behavior: it verifies what the agent did rather than what it said.
Article URL: https://github.com/Kareem-Rashed/rubric-eval Comments URL: https://news.ycombinator.com/item?id=48509073 Points: 1 # Comments: 0
When does activation steering work? A paper maps the boundaries and conditions of LLM behavior control, sparing you blind grid searches.
arXiv:2606.11599v1 Announce Type: cross Abstract: Activation steering offers a lightweight approach to control language models' behavior at inference …
Research reveals shortcomings in large language models' moral reasoning, sounding an alarm for safe AI development.
arXiv:2606.11635v1 Announce Type: cross Abstract: For highly capable AI systems to operate safely in dynamic, open-ended environments, they must be ab…
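Uncovering the truth behind LLM decisions: are they actually reasoning, or merely imitating rationales? This new study probes the AI's "subconscious."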
arXiv:2606.11016v1 Announce Type: new Abstract: We ask whether large language models (LLMs) merely imitate rationales when choosing between two option…
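Starting from interface design, this work explores how UI interventions can guide users toward more sustainable use of LLM chatbots, stepping outside the usual model-optimization approach.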
arXiv:2606.10861v1 Announce Type: cross Abstract: LLM-powered chatbots are increasingly embedded in everyday workflows, raising sustainability concern…
Uses LLM-driven behavior combined with kinematic constraints to generate realistic mobility anomaly scenarios.
arXiv:2606.10314v1 Announce Type: new Abstract: Although the study of human trajectory anomalies is critical for advancing spatial data mining, empiri…
How payoff scaling shapes cooperative behavior in cross-lingual LLM agents, offering a new game-theoretic perspective.
arXiv:2601.19082v2 Announce Type: replace-cross Abstract: Large language models (LLMs) are increasingly deployed as autonomous agents that negotiate, …
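Exposes blind spots in AI safety evaluation at the representation level, offering a fresh perspective and insights for model alignment.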
arXiv:2606.08044v1 Announce Type: new Abstract: Large Language Model (LLM) safety has often been evaluated at the behavior level, which provides limit…
A behavior-tree task orchestration tool that makes LLM agents' task calls more modular and composable.
Article URL: https://github.com/orion-arm-ai/tinytasktree Comments URL: https://news.ycombinator.com/item?id=48443382 Points: 1 # Comments: 0
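Traditional authorship attribution relies on long texts, but do the short prompts used in LLM interactions also carry a distinctive "handwriting"? This study proposes PromptPrint, which identifies you through behavioral biometrics.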
arXiv:2606.06755v1 Announce Type: new Abstract: Authorship attribution research has traditionally focused on long-form, expressive texts; however, int…
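ITHome, June 8: Administrative penalty decisions disclosed on June 8 by the Zhejiang Provincial Branch of the People's Bank of China show that NetEase Pay (Hangzhou) Co., Ltd. received a warning and a fine of 2.204 million yuan for violating account management rules, clearing management rules, and data security management rules, and for failing to conduct customer due diligence as required. An employee surnamed Yu at the company's technology center, who bears responsibility for the data security management violation…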
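The first benchmark measuring how well multimodal LLMs understand how interfaces steer user behavior, from ACL 2026, revealing the capabilities and limits of MLLMs in understanding UI/UX design.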
arXiv:2505.05026v5 Announce Type: replace Abstract: User interface (UI) design goes beyond visuals to shape user experience (UX), underscoring the shi…
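Automatically tracks your activity on your computer and phone, using the data to help you reclaim lost attention and put an end to out-of-control, fragmented browsing.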
This week I’ve been at SXSW London. There’s been music, film, and a lot—and I mean a lot—of talk about AI. I also had the opportunity to sit down with…
Companies are using Reddit to manipulate ChatGPT and Google AI search results, causing a serious decline in content quality.
Article URL: https://www.404media.co/companies-are-using-reddit-to-manipulate-chatgpt-and-google-ai-search/ Comments URL: https://news.ycombinator.com…
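Research finds that LLMs readily correct others' errors yet turn a blind eye to errors in their own reasoning, revealing a cognitive illusion of self-correction.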
arXiv:2606.05976v1 Announce Type: new Abstract: Recent work shows that LLM agents struggle to correct errors in their own reasoning traces yet show ma…
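A newly identified mechanism: context can silently shorten a large model's reasoning chain, revealing an underlying regularity in LLM behavior.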
arXiv:2604.01161v2 Announce Type: replace Abstract: Large language models (LLMs) exhibiting test-time scaling behavior, such as extended reasoning tra…
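Procrastination is not laziness; it is your nervous system acting up. Understanding the science behind it works better than tidying your desk.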
What Stage Fright Can Teach Developers About Procrastination You've built the project before. You know the tools. The readme is written. Everything is…
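36Kr has learned that the Shenzhen Stock Exchange announced that from June 1 to June 5, 2026, it took self-regulatory measures against 199 instances of abnormal securities trading, involving intraday price ramping and suppression, false order submissions, and other abnormal trading; it also reviewed 3 major matters at listed companies and reported 4 leads on suspected violations of laws and regulations to the China Securities Regulatory Commission.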
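Systematically reveals how open-weight LLMs' safety behavior differs across ethical domains, pointing to transparency gaps and unpredictable compliance.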
arXiv:2606.04035v1 Announce Type: cross Abstract: We present a systematic study of domain-dependent safety behavior in open-weight LLMs: 7 standardize…
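ITHome, June 4: On June 3, He Xiaopeng, chairman and CEO of XPeng, said in a livestream that, with strong sales of the XPeng GX lengthening delivery times, the online claim that "paying an extra 20,000 yuan lets you jump the delivery queue" is a rumor, and that such a practice is strictly prohibited under company policy. He stressed that XPeng will not let buyers pay extra to jump the queue, and that XPeng GX deliveries strictly follow the order in which customers placed their orders. ITHome note…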