When RL Meets Adaptive Speculative Training: A Unified Training-Serving System
By: Junxiong Wang, Fengxiang Bie, Jisen Li, Zhongzhu Zhou, Zelei Shao, Yubo Wang, Yinghui Liu, Qingyang Wu, Avner May, Sri Yanamandra, Yineng Zhang, Ce Zhang, Tri Dao, Percy Liang, Ben Athiwaratkun, Shuaiwen Leon Song, Chenfeng Xu, Xiaoxia Wu
Published: 2026-02-09
Category: cs.AI
Abstract
This paper proposes a unified training-serving system that integrates reinforcement learning (RL) with adaptive speculative training. The system aims to streamline the deployment and continual learning of AI models in production environments, with the goal of using resources more efficiently and improving performance in real-time applications where rapid adaptation is crucial.