LiteFrame: Efficient Vision Encoders Unlock Frame Scaling in Video LLMs
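Proposes an efficient vision encoder that addresses the visual-token explosion Video LLMs face on long videos, breaking through the frame-scaling bottleneck.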
arXiv:2605.17260v1 Announce Type: new Abstract: The fundamental challenge in scaling Video Large Language Models (Video LLMs) to long-form video lies …