Breaking the Capability Ceiling of LLM Post-Training by Reintroducing Markov States
By: Daniel White, Olivia Green, Sophia Black, Liam Grey
Published: 2026-03-20
Subjects: cs.AI
Abstract
This paper proposes a novel approach to enhance the capabilities of large language models during post-training by reintroducing Markov states. It argues that current post-training paradigms often hit a "capability ceiling" because they struggle to capture complex sequential dependencies. By conditioning on an explicit Markov state, the model can better learn and generalize across diverse tasks, leading to significant improvements in reasoning, planning, and long-context understanding, and unlocking new potential for LLM applications.
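The abstract does not specify how a Markov state is constructed, but the underlying idea can be sketched: a compact state that summarizes the interaction so far, such that the next state depends only on the current state and action rather than on the unbounded raw history. Below is a minimal, hypothetical Python illustration of that memoryless property in a post-training-style interaction loop; every name here (`MarkovState`, `transition`, the toy feedback signal) is an assumption made for illustration, not the paper's actual method or API.

```python
# Hypothetical sketch of an explicit Markov state for a post-training
# interaction loop. All names (MarkovState, transition, the toy
# feedback) are illustrative assumptions, not the paper's method.
from dataclasses import dataclass

@dataclass(frozen=True)
class MarkovState:
    """Compact summary of the interaction so far.

    The Markov property requires that the distribution of the next
    state depend only on this object, never on the full raw history.
    """
    step: int
    summary: tuple  # fixed-size features distilled from past turns

def transition(state: MarkovState, action: str, feedback: list) -> MarkovState:
    """Fold one (action, feedback) pair into a new state.

    Because the new state is a function of (state, action, feedback)
    alone, the memoryless property holds by construction: the policy
    never needs the unbounded transcript.
    """
    new_summary = (state.summary + tuple(feedback))[-4:]  # keep it bounded
    return MarkovState(step=state.step + 1, summary=new_summary)

# Toy rollout: the policy conditions only on the Markov state.
state = MarkovState(step=0, summary=())
for action in ["plan", "act", "reflect"]:
    feedback = [(state.step * 7 + len(action)) % 10]  # stand-in for env signal
    state = transition(state, action, feedback)
    print(state)
```

The design choice the sketch highlights is that the state stays fixed-size as the interaction grows, which is what lets sequential dependencies be modeled without re-reading the entire history at each step.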