Model Collapse as Cultural Evolution
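Explains model collapse caused by LLM self-training through cultural evolution theory, proposes five falsifiable predictions, and fills a gap on the linguistics side.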
arXiv:2605.23054v1 Announce Type: cross Abstract: Model collapse, the progressive degradation of LLMs trained on their own outputs, has been character…
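The core of AI agent design: start with a small loop of 2-3 tools and cap the iterations so you can observe trajectories; otherwise it is just a general-purpose LLM.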
TL;DR The model matters, but tools matter at least as much. Weak tool descriptions are one of the easiest agent failures to diagnose, and one of the m…
Post-training guided by iterative rewards lets tabular language models self-evolve and keep improving their performance.
arXiv:2604.18966v2 Announce Type: replace Abstract: Tabular language models can generate synthetic tables by modeling rows as token sequences, but the…
Iteration driven by real user feedback: how PasteCheck closed specific feature gaps in v1.3.
A few days ago I launched PasteCheck — a free tool to paste code and see errors highlighted instantly. After launch I found real gaps and shipped v1.3…
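A new framework for offline evaluation and iterative optimization of multi-agent LLM workflows, to appear at ACL 2026, aimed at tuning complex collaborative scenarios.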
arXiv:2605.18032v1 Announce Type: new Abstract: Multi-agent LLM workflows -- systems composed of multiple role-specific LLM calls -- often outperform …
Without human-annotated data, LLMs advance reinforcement learning through iterative coach-player reasoning.
arXiv:2602.02979v2 Announce Type: cross Abstract: Large Language Models (LLMs) have demonstrated strong potential in complex reasoning, yet their prog…
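New in reinforcement learning: a gradient iterated TD learning algorithm that addresses the shortcomings of semi-gradient updates and improves the stability of long-horizon decision-making.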
arXiv:2603.07833v2 Announce Type: replace-cross Abstract: Temporal-difference (TD) learning is highly effective at controlling and evaluating an agent…
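A bio-inspired, temporally sparse AI architecture that turns inference into hormone-driven closed-loop iteration, enabling energy-efficient introspection and departing from the conventional feed-forward paradigm.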
arXiv:2605.13872v1 Announce Type: cross Abstract: This article introduces S-AI-Recursive, a bio-inspired Sparse Artificial Intelligence architecture i…
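OpenAI proposes iterated amplification, an AI safety technique that uses task decomposition to specify complex goals beyond human scale; it is at an early stage but potentially scalable.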
We’re proposing an AI safety technique called iterated amplification that lets us specify complicated behaviors and goals that are beyond human scale,…
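Qwen3.7-Max-Preview and the Plus preview launch together, with large gains in complex reasoning and response speed; the MoE architecture again raises the capability ceiling.
Ranked first among Chinese models in both the text and vision domains.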
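PRISM, an iterative simulation and monitoring framework for prompt reliability in enterprise LLM conversational deployments, targets non-deterministic behavior drift in production.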
arXiv:2605.15665v1 Announce Type: new Abstract: Deploying large language model (LLM)-driven conversational agents in enterprise settings requires prom…
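Proves that high-accuracy log-concave sampling with stochastic gradients that have sub-exponential tails achieves polylogarithmic complexity, in sharp contrast to convex optimization.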
arXiv:2602.14342v2 Announce Type: replace-cross Abstract: We show that high-accuracy guarantees for log-concave sampling -- that is, iteration and que…
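Proposes LEAP, a trajectory-level evaluation framework that is the first to quantify the iterative learning process of LLMs in scientific design, rather than looking only at outcome snapshots.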
arXiv:2605.15341v1 Announce Type: cross Abstract: LLMs are increasingly deployed in autonomous laboratories, under the assumption that their domain pr…
From perfectionism to rapid experimentation: using a minimum viable product to break the cycle of unfinished projects.
An honest post about perfectionism, unfinished projects, and why my portfolio is named after a vegetable The worst advice I ever got about building pr…
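Proposes a minimal agentic baseline for systematically comparing AI theorem prover architectures; core features include iterative refinement, library search, and context management.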
arXiv:2602.24273v3 Announce Type: replace Abstract: We propose a minimal agentic baseline that enables systematic comparison across different AI-based…
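The all-new Li Auto L9 launches: can the range-extended flagship reclaim the lead? Starting at RMB 459,800, with a two-tone body and performance upgrades drawing attention.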
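Li Auto CEO Li Xiang. Only the launch of the all-new Li Auto L9 could get Li Xiang to go all out and dance for attention. On May 15, Li Auto launched the all-new L9 in two versions: the Li L9 Livis, priced at RMB 509,800, and the Li L9 Ultra, priced at RMB 459,800. Li Auto is also offering a RMB 20,000 cash launch benefit, valid through June 30. 36Kr Auto previously reported…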
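Lu Weibing confirms the XRING chip will be iterated this year, with strong performance, and will ship in an excellent product.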
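IT Home, May 16: In a livestream this afternoon, Lu Weibing, Xiaomi Group partner and president, president of the smartphone department, and general manager of the Xiaomi brand, responded to questions about the "XRING chip". Lu Weibing said: The XRING chip will definitely be iterated this year, but there are indeed many rumors circulating, and people need not put much stock in them. I can't disclose the specifics right now, but you can look forward to it; it will definitely be a…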
Coxwave used Vercel and Next.js to cut deployment time by 85% and move from weekly to daily iteration; in the GenAI race, delivery speed is the moat.
Coxwave helps enterprises build GenAI products that work at scale. With their consulting arm, AX, and their analytics platform, Align, they support so…