Discovering High Level Patterns from Simulation Traces
Mines high-level patterns from simulation traces, offering a new perspective for analyzing the behavior of complex systems.
arXiv:2602.10009v2 Announce Type: replace Abstract: Large Language Models (LLMs) are unable to reliably reason about specific physical systems. Attemp…
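The paper proposes ACC, a method that compiles agent trajectories for efficient long-context training, offering a new approach to long-text processing in large models.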
arXiv:2605.21850v1 Announce Type: new Abstract: Recent development of agents has renewed demand for long-context reasoning capacity of LLMs. However, …
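Reveals a rank-one structure in parameter trajectories during RLVR training: LLM reasoning ability can be extrapolated from only a very small amount of training, challenging conventional assumptions.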
arXiv:2605.21468v1 Announce Type: new Abstract: Reinforcement learning with verifiable rewards (RLVR) has become a dominant paradigm for improving rea…
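Automatically synthesizes large numbers of GUI interaction trajectories from video, addressing the scarcity of pretraining data for GUI agents and helping agents better understand real applications.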
arXiv:2605.14747v1 Announce Type: cross Abstract: Recent advances in multimodal large language models have driven growing interest in graphical user i…
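Proposes TSR, a trajectory-search rollout method that improves the reinforcement learning performance of LLM agents in multi-turn interaction.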
arXiv:2602.11767v3 Announce Type: replace-cross Abstract: Advances in large language models (LLMs) are driving a shift toward using reinforcement lear…
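An LLM distillation paper that unifies the SFT, DAgger, offline RL, and OPD perspectives, decoupling KL from trajectories to provide a new theoretical framework for model optimization.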
arXiv:2605.16826v1 Announce Type: new Abstract: Knowledge distillation is central to LLM post-training, yet its design space remains poorly understood…
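Uses LLMs combined with Monte Carlo methods to model emotional trajectories and latent ambiguity, offering a novel computational perspective on interpersonal dynamics.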
arXiv:2601.03645v2 Announce Type: replace Abstract: Emotional coordination is a core property of human interaction that shapes how relational meaning …
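Anticipates scam behavior early from partial, streaming app-usage trajectories. The new paper proposes ORACLE, a forward-looking approach to online safety.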
arXiv:2605.16363v1 Announce Type: new Abstract: Smartphone scams are increasingly prevalent and typically manifest as multi-stage, cross-application p…
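Proposes a family of f-trajectory balance losses that unifies on-policy and off-policy training of GFlowNets and LLMs; the gradients correspond to KL divergences, with low variance and high efficiency.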
arXiv:2605.15417v1 Announce Type: cross Abstract: In GFlowNets and variational inference, it has been shown that the mean square error between target …
arXiv:2605.15454v1 Announce Type: cross Abstract: Reasoning-trained language models often spend more tokens on harder problems, but longer chains of t…
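Proposes LEAP, a trajectory-level evaluation framework that for the first time quantifies the iterative learning process of LLMs in scientific design, rather than looking only at outcome snapshots.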
arXiv:2605.15341v1 Announce Type: cross Abstract: LLMs are increasingly deployed in autonomous laboratories, under the assumption that their domain pr…
Uses Chain-of-Agents to strengthen large models' temporal reasoning over long medical records, enabling accurate lung cancer risk prediction.
arXiv:2510.10454v2 Announce Type: replace Abstract: Large language models (LLMs) offer a generalizable approach for modeling patient trajectories, but…
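Uses the JSON-Bag model to convert game trajectories into a generic vector representation, and evaluates trajectory similarity efficiently with a JSD measure and prototype nearest-neighbor search.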
arXiv:2508.00712v2 Announce Type: replace Abstract: We introduce JSON Bag-of-Tokens model (JSON-Bag) as a method to generically represent game traject…
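Multi-agent systems collapse in a cascade once a critical error occurs. AgentForesight moves failure diagnosis from post-hoc attribution to online auditing: it identifies the first decisive error in real time during trajectory execution, and a 7B model outperforms the baselines, providing a valuable safety guardrail for complex agent systems.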
arXiv:2605.08715v2 Announce Type: replace-cross Abstract: LLM-based multi-agent systems are increasingly deployed on long-horizon tasks, but a single …
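Prescription recommendation is a key pain point for deploying medical AI. RxEval is the first to refine evaluation to prescription-level multiple choice over combinations of three elements: drug name, dose, and route. Among 16 mainstream LLMs, the best exact-match score is only 46%, and frontier models still overlook key patient information; trustworthy prescription AI remains a long way off.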
arXiv:2605.14543v1 Announce Type: cross Abstract: Inpatient medication recommendation requires clinicians to repeatedly select specific medications, d…