牛哥精选 (Niu Ge's Picks) · This Month


🤖 AI · Large Models arXiv AI 2026-06-11

On the Limits of LLM-as-Judge for Scientific Novelty Assessment

Exposes critical weaknesses of LLMs in assessing scientific novelty and challenges blind trust in AI judgment.

arXiv:2606.12071v1 Announce Type: cross Abstract: LLMs are increasingly used to generate and judge scientific ideas. This makes novelty evaluation a c…

LLM evaluation, scientific novelty, limitations, paper review, AI judgment

🤖 AI · Large Models IT之家 2026-06-07

Sports commentators need not fear for their jobs just yet: study says AI models analyzing games are "almost guessing"

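A new study finds that AI models analyze sports games far worse than humans do, with accuracy close to guessing; sports commentators' jobs are safe for now.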

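IT之家, June 6: According to a report this evening (June 6) by Futurism, a new study by researchers at the University of North Carolina at Chapel Hill and Northeastern University found that mainstream AI models perform poorly at analyzing professional sports games. The study set out to examine how popular AI models perform in four areas: perception, reasoning, simulation, and autonomous action. Existing test methods struggle to accurately evaluate…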

sports commentators, no job worries for now, study, models analyzing games, almost guessing

🤖 AI · Large Models Hacker News AI 2026-06-05

LLM AI Chatbots are letting me down every single day

Why do LLM chatbots disappoint every day? The author vents from first-hand experience about AI's limitations and flaws.

Article URL: https://umrashrf.github.io/llm-ai-chatbots-are-letting-me-down-every-single-day/ Comments URL: https://news.ycombinator.com/item?id=48406…

LLM, AI chatbots, disappointment, limitations, user experience

💰 Business & Tech TechCrunch 2026-06-02

Rocket engine startup Impulse raises $500 million to hire people, not AI

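Rocket engine startup Impulse has raised $500 million, yet insists on using people rather than AI for its R&D; its clear-eyed view of what simulation technology can do is worth reflecting on.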

Engineering physical systems still depends on human talent, according to Impulse Space president Eric Romo.

rocket engines, funding, human resources, simulation, aerospace

🤖 AI · Large Models arXiv AI 2026-05-27

LLMs versus the Halting Problem: Characterizing Program Termination Reasoning

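Testing the limits of LLMs with the halting problem: through theoretical analysis and experimental comparison, this paper characterizes what large language models can and cannot do when reasoning about program termination.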

arXiv:2601.18987v5 Announce Type: replace-cross Abstract: Determining whether a program terminates is a central problem in computer science. Turing's …

large language models, halting problem, program termination, formal reasoning, theoretical computer science

📝 Deep Tech Hacker News LLM 2026-05-26

Amdahl's Law for LLM generated code

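Applies Amdahl's Law to the core bottleneck of LLM-generated code: however fast generation becomes, slow line-by-line auditing fundamentally caps the overall speedup.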

LLMs may theoretically be able to generate millions of correct lines of code. But for any important code the only way to know that it's correct is to …

LLM, code generation, Amdahl's Law, code auditing, limitations

🤖 AI · Large Models arXiv NLP 2026-05-20

Language models fail at extended rule following

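New research shows that large language models fail markedly at complex, multi-step rule following, challenging assumptions about the boundaries of their current capabilities.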

arXiv:2605.02028v2 Announce Type: replace Abstract: Large language models are highly capable of answering difficult questions by retrieving, recombini…

language models, rule following, AI limitations, deep learning research, large model evaluation
