Enhancing Policy Learning with World-Action Model
By: Zichang Wang, Xiaochen Li, Shagun Singh, Xiang Li, Chuang Gan, Joshua B. Tenenbaum, S. M. Ali Eslami
Published: 2026-03-31
Subjects: cs.AI
Abstract
This paper presents the World-Action Model (WAM), an action-regularized world model that jointly reasons over future visual observations and the actions that drive state transitions. WAM integrates an inverse dynamics objective into DreamerV2, predicting actions from latent state transitions so that the learned representations capture action-relevant structure. Evaluated on eight manipulation tasks from the CALVIN benchmark, WAM achieves higher behavioral cloning success rates and stronger PPO fine-tuning performance while requiring significantly fewer training steps than baselines.
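The core mechanism the abstract describes, an inverse dynamics objective over latent state transitions, can be sketched as a small auxiliary head. The sketch below is illustrative, not the paper's implementation: the latent and action dimensions, MLP architecture, and loss choice are all assumptions (CALVIN's 7-DoF action space motivates the action size).

```python
# Minimal sketch (assumed architecture, not the paper's code) of an inverse
# dynamics head: predict the action a_t from consecutive latents (z_t, z_{t+1}).
import torch
import torch.nn as nn

LATENT_DIM, ACTION_DIM = 32, 7  # assumed sizes; CALVIN uses 7-DoF actions

class InverseDynamicsHead(nn.Module):
    def __init__(self, latent_dim=LATENT_DIM, action_dim=ACTION_DIM):
        super().__init__()
        # Small MLP over the concatenated transition (z_t, z_{t+1}).
        self.net = nn.Sequential(
            nn.Linear(2 * latent_dim, 128),
            nn.ELU(),
            nn.Linear(128, action_dim),
        )

    def forward(self, z_t, z_next):
        return self.net(torch.cat([z_t, z_next], dim=-1))

def inverse_dynamics_loss(head, z_t, z_next, actions):
    # Auxiliary term that would be added to the world-model objective so
    # gradients shape the latents to retain action-relevant information.
    return nn.functional.mse_loss(head(z_t, z_next), actions)

# Toy usage with random tensors standing in for world-model latents.
head = InverseDynamicsHead()
z_t, z_next = torch.randn(16, LATENT_DIM), torch.randn(16, LATENT_DIM)
actions = torch.randn(16, ACTION_DIM)
loss = inverse_dynamics_loss(head, z_t, z_next, actions)
loss.backward()  # gradients flow into the head (and, in a full model, the encoder)
```

In a full world model the latents would come from the encoder/dynamics network, so this loss also regularizes the representation itself, which is the point the abstract makes.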