Training-Time Action Conditioning for Efficient Real-Time Robot Control
By: Oliver Schmidt, Chloe Brown, Daniel Kim
Published: 2025-12-02
View on arXiv (cs.AI)
Abstract
Researchers at Physical Intelligence developed a method for real-time robot control that shifts action chunk conditioning from inference time to training time, achieving lower latency and improved robustness for Vision-Language-Action (VLA) models, especially under long inference delays. The approach reduces end-to-end latency by an average of 27 ms compared to prior methods while maintaining task performance on complex real-world tasks.