Mixing Times of Glauber Dynamics on Masked Language Models
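Analyzes the mixing time of Glauber dynamics on masked language models from a statistical-physics perspective, providing a theoretical basis for understanding the sampling behavior of MLMs.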
arXiv:2605.16378v1 Announce Type: new Abstract: Masked language models (MLMs) define local conditional distributions over tokens but do not, in genera…
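
To make the object of study concrete, the following is a minimal sketch of Glauber dynamics: at each step one position is chosen uniformly at random and its token is resampled from the conditional distribution given all other positions. Everything named here is an assumption for illustration and not taken from the paper: `toy_conditional` is a hypothetical stand-in for an MLM's masked-token prediction (an Ising-chain-style rule over a two-token vocabulary), and `glauber_step` and `run_chain` are illustrative helper names.

```python
import math
import random

VOCAB = (0, 1)  # toy two-token vocabulary (assumption, not from the paper)


def toy_conditional(seq, i, beta=0.5):
    """Hypothetical stand-in for an MLM's prediction at masked position i.

    Returns one probability per token in VOCAB, given the other positions.
    Tokens that agree with their immediate neighbours get more weight.
    """
    weights = []
    for tok in VOCAB:
        agree = 0
        if i > 0 and seq[i - 1] == tok:
            agree += 1
        if i < len(seq) - 1 and seq[i + 1] == tok:
            agree += 1
        weights.append(math.exp(beta * agree))
    total = sum(weights)
    return [w / total for w in weights]


def glauber_step(seq, rng, conditional=toy_conditional):
    """One Glauber update: pick a uniformly random position and
    resample its token from the conditional given the rest."""
    i = rng.randrange(len(seq))
    probs = conditional(seq, i)
    new_seq = list(seq)  # copy so the input sequence is left unchanged
    new_seq[i] = rng.choices(VOCAB, weights=probs, k=1)[0]
    return new_seq


def run_chain(n_tokens=8, n_steps=200, seed=0):
    """Run the chain from a random start and return the final sequence."""
    rng = random.Random(seed)
    seq = [rng.choice(VOCAB) for _ in range(n_tokens)]
    for _ in range(n_steps):
        seq = glauber_step(seq, rng)
    return seq


if __name__ == "__main__":
    print(run_chain())
```

In this sketch the mixing time would correspond to how large `n_steps` must be before the distribution of the returned sequence is close to the chain's stationary distribution. With a real MLM, `toy_conditional` would be replaced by the model's predicted distribution at the masked position.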