Unlocking Dense Metric Depth Estimation in VLMs
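A new method lets vision-language models break through the bottleneck in dense 3D geometric perception, enabling efficient depth estimation.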
arXiv:2605.15876v1 Announce Type: new Abstract: Vision-Language Models (VLMs) excel at 2D tasks such as grounding and captioning, yet remain limited i…