HiDe: Rethinking The Zoom-IN method in High Resolution MLLMs via Hierarchical Decoupling
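Rethinks the Zoom-IN method in high-resolution multimodal large language models and proposes the Hierarchical Decoupling framework, significantly improving visual understanding performance.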
arXiv:2510.00054v2 Announce Type: replace Abstract: Multimodal Large Language Models (MLLMs) have made significant strides in visual understanding tas…