Divergence-Suppressing Couplings for Rectified Flow
Proposes divergence-suppressing couplings for Rectified Flow, improving the stability and efficiency of generative models.
arXiv:2605.17733v1 Announce Type: cross Abstract: The promise of Rectified Flow rests on producing self-generated couplings whose trajectories are str…
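An uncertainty calibration method for recommendations to low-active users that improves reliability in cold-start scenarios; accepted at KDD 2026.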
arXiv:2605.17788v1 Announce Type: cross Abstract: A fundamental challenge in recommender systems is balancing reliability for Low-Active Users (LAUs) …
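Investigates the bottlenecks of latent visual reasoning in vision-language models, revealing obstacles to simulating human-like intermediate visual steps.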
arXiv:2605.18445v1 Announce Type: cross Abstract: Humans can approach complex visual problems by mentally simulating intermediate visual steps, rather…
Finds that LLM hallucinations are not random but scale predictably with model size and topic frequency.
arXiv:2605.18732v1 Announce Type: cross Abstract: While scaling laws govern aggregate large language model performance, no scaling law has linked fact…
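Proposes a differentiable, adaptive sparse hierarchical attention mechanism that significantly improves long-sequence modeling efficiency and computational scalability.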
arXiv:2605.18753v1 Announce Type: cross Abstract: Current hierarchical attention methods, such as NSA and InfLLMv2, select the top-k relevant key-valu…
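A theoretical account of what Transformers can learn on noisy and task-level manifolds, with new insights from approximation and generalization analysis.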
arXiv:2505.03205v3 Announce Type: replace Abstract: Transformers serve as the foundational architecture for large language and video generation models…
Describes how LLM agents can break down platform barriers and reshape the open Internet ecosystem.
arXiv:2506.23978v3 Announce Type: replace Abstract: While the Internet's core infrastructure was designed to be open and universal, today's applicatio…
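Uses contrastive learning to improve machine unlearning, addressing the incomplete forgetting of existing methods and offering a new approach to removing data from models.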
arXiv:2509.16391v3 Announce Type: replace Abstract: Machine unlearning (MU) aims to remove the influence of specific "forget" data from a trained mode…
Reinterprets Transformer attention from a Bayesian-geometric perspective, revealing its underlying probabilistic structure.
arXiv:2512.22471v5 Announce Type: replace Abstract: Transformers often appear to perform Bayesian reasoning in context, but verifying this rigorously …
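ACL 2026 paper: a knowledge-aligned student error simulator that accurately generates typical student errors in open-ended programming tasks, supporting intelligent educational assessment.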
arXiv:2601.06633v2 Announce Type: replace Abstract: Open-ended tasks, such as coding problems that are common in computer science education, provide d…
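Proposes a sub-1-bit quantization method that sharply reduces the storage and compute cost of large language models while balancing efficiency and performance.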
arXiv:2602.06694v2 Announce Type: replace Abstract: Weight-only quantization has become a standard approach for efficiently serving large language mod…
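Circuits and bifurcations in sparse-point-cloud-conditioned 3D diffusion models, revealing the mechanisms underlying safety-critical reconstruction.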
arXiv:2602.11130v2 Announce Type: replace Abstract: Sparse point clouds are a common input modality for 3D surface reconstruction, including in safety…
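Shows that self-play evolves effectively only when self-synthesized data provides learnable information gain, offering key theoretical guidance for AI training strategies.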
arXiv:2603.02218v2 Announce Type: replace Abstract: Large language models (LLMs) make it plausible to build systems that improve through self-evolving…
Examines pruning and distillation techniques in the pretraining of large MoE models; SlimQwen optimizes efficiency and performance.
arXiv:2605.08738v2 Announce Type: replace Abstract: Structured pruning and knowledge distillation (KD) are typical techniques for compressing large la…
Machine unlearning in LLMs is not true deletion: this study reveals its reversibility, challenging existing safety assumptions.
arXiv:2505.16831v3 Announce Type: replace-cross Abstract: Unlearning in large language models (LLMs) aims to remove specified data, but its efficacy i…
A new method, Prompt Reinforcing, improves the long-horizon planning ability of large models on complex tasks.
arXiv:2510.05921v3 Announce Type: replace-cross Abstract: Large language models (LLMs) have achieved remarkable success in a wide range of natural lan…
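Combines LLMs with multi-agent reinforcement learning to build a resilient cloud network defense framework that addresses the security challenges introduced by virtualization.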
arXiv:2601.07122v2 Announce Type: replace-cross Abstract: While virtualization and resource pooling empower cloud networks with structural flexibility…
ICML 2026 paper: an in-depth look at the expert-level internal mechanisms of Mixture-of-Experts language models, revealing how experts cooperate and compete.
arXiv:2604.02178v2 Announce Type: replace-cross Abstract: Mixture-of-Experts (MoE) architectures have become the dominant choice for scaling Large Lan…
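Proposes the SMART framework, which incorporates pre-trained models into high-dimensional nonparametric variable selection and provides a theoretical foundation for fine-tuning.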
arXiv:2604.12288v2 Announce Type: replace-cross Abstract: Fine-tuning is a widely used strategy for adapting pre-trained models to new tasks, yet its …
Proposes a new method for detecting causal bias in generative AI, providing theoretical support for improving model fairness.
arXiv:2605.11365v2 Announce Type: replace-cross Abstract: Automated systems built on artificial intelligence (AI) are increasingly deployed across hig…