BEAM: Binary Expert Activation Masking for Dynamic Routing in MoE
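A new dynamic routing method uses binary expert activation masks to cut redundant computation in MoE models, speeding up inference without retraining.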
arXiv:2605.14438v1 Announce Type: new Abstract: Mixture-of-Experts (MoE) architectures enhance the efficiency of large language models by activating o…