MiniGPT: Rebuilding GPT from First Principles
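Reproduces GPT's core mechanisms from scratch with a compact PyTorch implementation of an autoregressive language model; a foundational paper-tutorial for AI learners.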
arXiv:2605.17398v1 Announce Type: cross Abstract: This paper presents MiniGPT, a compact from-scratch implementation of GPT-style autoregressive langu…
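Proposes Delta Forcing, a method that addresses the trade-off between responsiveness and stability in interactive autoregressive video generation.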
arXiv:2605.14382v2 Announce Type: replace Abstract: Interactive real-time autoregressive video generation is essential for applications such as conten…
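Matrix decoupling concentration inequalities for autoregressive sequences, providing dimension-free guarantees for sparse long-context rewards; a theoretical contribution.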
arXiv:2605.06017v2 Announce Type: replace Abstract: Sequence-level evaluations in autoregressive Large Language Models (LLMs) rely on highly dependent…
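Proposes a new method for conditional property estimation using autoregressive sequence models, targeting the difficulty generative models have in controlling global structure.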
arXiv:2605.14004v1 Announce Type: new Abstract: Generative models are often trained with a next-token prediction objective, yet many downstream applic…
Proposes a motion-aware cache reuse strategy that significantly accelerates autoregressive video generation.
arXiv:2605.01725v2 Announce Type: replace-cross Abstract: Autoregressive video generation paradigms offer theoretical promise for long video synthesis…
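Exposes a weakness in LLMs' left-to-right understanding: adding noise on the left side alone is enough to bypass the safety guardrails of black-box LLMs.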
arXiv:2410.02832v2 Announce Type: replace-cross Abstract: This paper proposes a simple yet effective jailbreak attack named FlipAttack against black-b…
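Proposes using JEPAs to audit LLM fine-tuning: predicting latent representations rather than outputs, with the aim of improving task metrics.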
arXiv:2605.15394v1 Announce Type: cross Abstract: Joint-embedding predictive architectures (JEPAs) propose that a model should learn more useful abstr…
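A variational autoregressive network combined with probabilistic priors, a new approach to critical slowing down in Monte Carlo methods.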
arXiv:2605.16020v1 Announce Type: new Abstract: Monte Carlo methods are essential across diverse scientific fields, yet their efficiency is frequently…