Minerva-Ego: Spatiotemporal Hints for Egocentric Video Understanding
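A new breakthrough in AI video reasoning: spatiotemporal hints strengthen first-person (egocentric) video understanding, and the work also evaluates intermediate reasoning steps, filling a gap in existing benchmarks.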
arXiv:2605.15342v1
Announce Type: cross
Abstract: Video reasoning models are a core component of egocentric and embodied agents. However, standard ben…