AMO: Adaptive Muon Orthogonalization
Proposes an adaptive orthogonalization method for Muon that may improve deep learning training.
arXiv:2605.17806v1 Announce Type: new Abstract: Muon has recently emerged as a competitive alternative to AdamW for large-scale pre-training, with ort…
Proposes an asynchronous linear minimization oracle momentum method to accelerate large-scale combinatorial optimization.
arXiv:2605.18174v1 Announce Type: new Abstract: Muon has recently emerged as a strong alternative to AdamW for training neural networks, with encourag…