Mixture-of-Experts Can Surpass Dense LLMs Under Strictly Equal Resource
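Recent ICLR 2026 research shows for the first time that the MoE architecture can surpass dense large models under strictly equal resource conditions.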
arXiv:2506.12119v2. Abstract: Mixture-of-Experts (MoE) language models dramatically expand model capacity and achieve remarkable…