Agent psychometrics: Task-level performance prediction in agentic coding benchmarks
By: Chris Ge, Daria Kryvosheieva, Daniel Fried
Published: 2026-04-23
Category: cs.AI
Abstract
This paper studies agent psychometrics: predicting task-level performance in agentic coding benchmarks. It examines methods for evaluating AI coding agents beyond simple pass/fail rates, with the aim of characterizing their strengths, weaknesses, and potential for real-world software development. By developing metrics and predictive models of agent performance, the research aims to support more reliable and efficient AI assistants for programmers, and in turn to improve the productivity and quality of software engineering.