Boosting Omni-Modal Language Models: Staged Post-Training with Visually Debiased Evaluation
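To address the visual shortcut problem in omni-modal language models, this work proposes staged post-training and a visually debiased evaluation method, improving the faithfulness of multimodal understanding.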
arXiv:2605.12034v2 Announce Type: replace-cross Abstract: Omni-modal language models are intended to jointly understand audio, visual inputs, and lang…