Observations and Remedies for Large Language Model Bias in Self-Consuming Performative Loop

By: Yaxuan Wang, Zhongteng Cai, Yujia Bao, Xueru Zhang, Yang Liu

Published: 2026-01-08


Abstract

The rapid advancement of large language models (LLMs) has led to growing interest in using synthetic data to train future models. However, this practice creates a self-consuming retraining loop in which models are trained on their own outputs, which can degrade performance and induce emergent biases. In this study, we introduce the concept of the Self-Consuming Performative Loop (SCPL) and investigate the role of synthetic data in shaping bias during these dynamic, iterative training processes. We design a reward-based rejection sampling strategy to mitigate this bias, moving towards more trustworthy self-improving systems.
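The abstract does not spell out the details of the reward-based rejection sampling strategy, but the general idea can be illustrated with a minimal sketch: before synthetic outputs are fed back into the training pool, each one is scored by a reward model and low-scoring samples are rejected, so that bias is not amplified across retraining generations. The sketch below is illustrative only and is not the authors' implementation; the callables `generate`, `reward_fn`, and `finetune` are hypothetical stand-ins for whatever generation, reward-scoring, and training code is actually used.

```python
"""Minimal sketch (assumed, not the paper's code) of one iteration of a
self-consuming retraining loop with reward-based rejection sampling."""
import random
from typing import Callable, List


def rejection_sample(
    candidates: List[str],
    reward_fn: Callable[[str], float],
    threshold: float,
) -> List[str]:
    """Keep only synthetic samples whose reward score clears the threshold."""
    return [text for text in candidates if reward_fn(text) >= threshold]


def self_consuming_iteration(
    real_data: List[str],
    generate: Callable[[int], List[str]],   # hypothetical: samples from the current model
    reward_fn: Callable[[str], float],      # hypothetical: reward / bias scorer
    finetune: Callable[[List[str]], None],  # hypothetical: trains the next-generation model
    n_synthetic: int = 1000,
    threshold: float = 0.5,
    synthetic_fraction: float = 0.5,
) -> None:
    """One generation of the loop: generate, filter, mix with real data, retrain."""
    synthetic = generate(n_synthetic)
    accepted = rejection_sample(synthetic, reward_fn, threshold)

    # Mix the accepted synthetic data with real data at a fixed ratio,
    # then fine-tune the next model on the combined pool.
    k = int(len(real_data) * synthetic_fraction)
    mixed = real_data + random.sample(accepted, min(k, len(accepted)))
    finetune(mixed)
```

Under these assumptions, the rejection step is the only lever against bias amplification: the threshold and the choice of reward model determine how aggressively biased outputs are filtered before re-entering the loop.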
