Robo-Dopamine: General Process Reward Modeling for High-Precision Robotic Manipulation
By: Huajie Tan, Sixiang Chen, Yijie Xu, Zixiao Wang, Yuheng Ji, Cheng Chi, Yaoxu Lyu, Zhongxia Zhao, Xiansheng Chen, Peterson Co, Shaoxuan Xie, Guocai Yao, Pengwei Wang, Zhongyuan Wang, Shanghang Zhang
Published: 2025-12-29
Category: cs.AI
Abstract
The paper presents Robo-Dopamine, a reinforcement learning (RL) framework for high-precision robotic manipulation. It introduces two components: Dopamine-Reward, a novel multi-view, step-aware process reward model, and Dopamine-RL, a robust policy learning framework built on theoretically sound Policy-Invariant Reward Shaping. Together, they learn dense reward signals efficiently, accelerate policy optimization, and avoid semantic traps, making RL more practical for real-world robotics.
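The abstract does not spell out how Policy-Invariant Reward Shaping is formulated. As a rough illustration of the general idea only, the sketch below uses classic potential-based shaping, r' = r + γΦ(s') − Φ(s), a well-known form that leaves the optimal policy unchanged. It is not taken from the paper: the function name `shaped_rewards`, the scalar "progress" states, and the identity potential in the usage lines are all hypothetical choices made for this example.

```python
from typing import Callable, List, Sequence, TypeVar

S = TypeVar("S")


def shaped_rewards(
    rewards: Sequence[float],
    states: Sequence[S],
    potential: Callable[[S], float],
    gamma: float = 0.99,
) -> List[float]:
    """Apply potential-based shaping to one trajectory.

    rewards[t] is the reward for the transition states[t] -> states[t + 1],
    so `states` must hold exactly one more element than `rewards`.
    Each shaped reward is r_t + gamma * phi(s_{t+1}) - phi(s_t).
    """
    if len(states) != len(rewards) + 1:
        raise ValueError("states must contain len(rewards) + 1 elements")
    return [
        r + gamma * potential(s_next) - potential(s)
        for r, s, s_next in zip(rewards, states[:-1], states[1:])
    ]


# Hypothetical usage: each state is a scalar task-progress estimate in [0, 1],
# and the potential is simply that estimate (an assumption for illustration).
sparse = [0.0, 0.0, 1.0]          # sparse task reward, paid only at success
progress = [0.0, 0.5, 0.8, 1.0]   # assumed progress estimates per state
dense = shaped_rewards(sparse, progress, potential=lambda p: p, gamma=1.0)
```

With `gamma=1.0` the shaping terms telescope, so the shaped return differs from the original return only by Φ(s_T) − Φ(s_0). That difference is independent of the actions taken, which is why this form of shaping densifies the signal without changing which policy is optimal.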