ConvoLearn: A Dataset of Constructivist Tutor-Student Dialogue
By: Mayank Sharma, Roy Pea, Hari Subramonyam
Published: 2026-01-13
Abstract
Large Language Models (LLMs) in educational applications often reveal solutions rather than fostering dialogic learning. This paper introduces ConvoLearn, a dataset grounded in knowledge building theory that operationalizes six pedagogical dimensions: cognitive engagement, formative assessment, accountability, cultural responsiveness, metacognition, and power dynamics. ConvoLearn is a semi-synthetic dataset of 1,250 tutor-student dialogues in middle school Earth Science. Training LLMs on it meaningfully shifts their behavior towards knowledge-building strategies, and in human evaluations the trained models significantly outperform base models. Together, these results establish a framework for developing and evaluating constructivist AI tutors.