A Pragmatic VLA Foundation Model
By: Wei Wu, Fan Lu, Yunnan Wang
Published: 2026-01-26
arXiv: cs.AI
Abstract
LingBot-VLA is a Vision-Language-Action (VLA) foundation model pre-trained on 20,000 hours of real-world robot data spanning multiple embodiments. The model demonstrates that VLA performance continues to scale with data volume without saturating, achieves superior success rates on a 100-task real-world benchmark across three robot platforms, and improves training efficiency, directly advancing practical robotics.
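The central empirical claim is a scaling one: performance keeps improving as pre-training data grows, rather than plateauing. As a minimal illustrative sketch only (the data points, curve forms, and parameter values below are hypothetical placeholders, not results or methods from the paper), here is one common way to probe such a claim: fit both an unsaturated power law and a saturating alternative to success-rate measurements and compare residuals.

```python
# Illustrative sketch: comparing an unsaturated power-law fit against a
# saturating fit for success rate vs. pre-training data volume.
# All numbers are made up for demonstration; they are NOT LingBot-VLA results.
import numpy as np
from scipy.optimize import curve_fit

hours = np.array([1250, 2500, 5000, 10000, 20000], dtype=float)  # data volume (hours)
success = np.array([0.31, 0.38, 0.46, 0.55, 0.66])               # hypothetical success rates

def power_law(x, a, b):
    # Unsaturated scaling: performance grows as a * x^b with no ceiling.
    return a * np.power(x, b)

def saturating(x, s, k):
    # Saturating alternative: approaches the ceiling s as x grows large.
    return s * x / (x + k)

p_pow, _ = curve_fit(power_law, hours, success, p0=[0.05, 0.25])
p_sat, _ = curve_fit(saturating, hours, success, p0=[0.8, 5000.0])

for name, f, params in [("power law", power_law, p_pow),
                        ("saturating", saturating, p_sat)]:
    rss = np.sum((success - f(hours, *params)) ** 2)
    print(f"{name}: params={np.round(params, 4)}, RSS={rss:.5f}")
```

If the power law fits markedly better than the saturating curve over the measured range, that is consistent with "no saturation yet"; the paper's actual evidence and methodology should be consulted for how the authors establish this.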