Hybrid Training for Vision-Language-Action Models
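An ICLR 2026 paper proposes a hybrid training framework that unifies vision-language-action models, improving multimodal embodied intelligence performance.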
arXiv:2510.00600v2 Announce Type: replace-cross Abstract: Using Large Language Models to produce intermediate thoughts, a.k.a. Chain-of-thought (CoT),…