StackPlanner: A Centralized Hierarchical Multi-Agent System with Task-Experience Memory Management
By: Ruizhe Zhang, Xinke Jiang, Zhibang Yang, Zhixin Zhang, Jiaran Gao, Yuzhen Xiao, Hongbin Lai, Xu Chu, Junfeng Zhao, Yasha Wang
Published: 2026-01-09
Abstract
Centralized LLM-based multi-agent systems often struggle with unstable long-horizon collaboration because they lack memory management, which leads to context bloat, error accumulation, and poor cross-task generalization. StackPlanner is a hierarchical multi-agent framework with explicit memory control. It decouples high-level coordination from subtask execution through active task-level memory control, and it learns to retrieve and exploit reusable coordination experience through structured experience memory and reinforcement learning. Experiments on deep-search and agent-system benchmarks demonstrate its effectiveness in enabling reliable long-horizon multi-agent collaboration.
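To make the two memory ideas in the abstract concrete, here is a minimal Python sketch. It is not the authors' implementation: the abstract gives no algorithmic detail, so the class names (`Frame`, `Coordinator`), the stack-based task memory, the truncation-based summary, and the keyword lookup are all assumptions chosen for illustration. In particular, the paper learns experience retrieval with reinforcement learning, whereas this sketch substitutes a plain substring match.

```python
from dataclasses import dataclass, field


@dataclass
class Frame:
    """Working context for one subtask (hypothetical structure)."""
    goal: str
    notes: list = field(default_factory=list)


class Coordinator:
    """Hypothetical coordinator that delegates execution and controls memory."""

    def __init__(self, executor, max_summary_chars=80):
        self.executor = executor      # callable: goal -> list of raw note strings
        self.stack = []               # active task-level memory (assumed to be a stack)
        self.summaries = []           # condensed results kept after a subtask ends
        self.experience = {}          # keyword -> reusable coordination plan
        self.max_summary_chars = max_summary_chars

    def run_subtask(self, goal):
        """Push a frame, let the executor fill it, then pop and condense it."""
        frame = Frame(goal)
        self.stack.append(frame)
        frame.notes.extend(self.executor(goal))
        done = self.stack.pop()
        # Keep only a bounded summary so the coordinator's context does not bloat.
        # Truncation stands in for whatever condensation a real system would use.
        summary = " | ".join(done.notes)[: self.max_summary_chars]
        self.summaries.append(f"{goal}: {summary}")
        return summary

    def remember_plan(self, keyword, plan):
        """Store a plan under a keyword (stand-in for structured experience memory)."""
        self.experience[keyword.lower()] = list(plan)

    def recall_plan(self, task):
        """Return the first stored plan whose keyword occurs in the task, else None.

        Substring matching is an assumption; the paper learns retrieval with RL.
        """
        text = task.lower()
        for keyword, plan in self.experience.items():
            if keyword in text:
                return plan
        return None
```

In this sketch the coordinator only ever sees bounded summaries of finished subtasks, which is one simple way to separate high-level coordination from subtask execution. The raw notes live in a frame that is discarded once the subtask is popped.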