Language models fail at extended rule following
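New research shows that large language models fail markedly at complex, multi-step rule following, challenging assumptions about the limits of their capabilities.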
arXiv:2605.02028v2 Announce Type: replace Abstract: Large language models are highly capable of answering difficult questions by retrieving, recombini…
Shows how architectural sparsity in the FFN reshapes attention computation and affects how small Transformer models learn.
arXiv:2605.09403v2 Announce Type: replace-cross Abstract: Architectural choices inside the Transformer feedforward network (FFN) block do not merely a…