1
VLMs Trace Without Tracking: Diagnosing Failures in Visual Path Following
研究揭示视觉语言模型在追踪连续路径时的隐藏缺陷,诊断失败根源。
arXiv:2605.15672v1 Announce Type: cross Abstract: Vision-language models (VLMs) achieve strong performance on multimodal benchmarks, but may still lac…