ClawEnvKit: Automatic Environment Generation for Claw-Like Agents

By: Xirui Li, Ming Li, Derry Xu, Wei-Lin Chiang, Ion Stoica, Cho-Jui Hsieh, Tianyi Zhou

Published: 2026-04-21

#cs.AI

Abstract

This paper introduces ClawEnvKit, an autonomous pipeline that generates diverse, verified environments for training and evaluating claw-like robotic agents from natural language descriptions. The toolkit streamlines the creation of large-scale benchmarks, addressing the scalability limits of manual environment construction. It comprises three components, a parser, a generator, and a validator, which together ensure that generated environments are feasible, diverse, and consistent. The resulting Auto-ClawEval benchmark demonstrates significant cost reduction and greater evaluation scale. It also shows that harness engineering boosts performance and highlights the need for continuous evaluation.
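
The parser, generator, and validator stages named in the abstract can be illustrated with a toy sketch. Everything below is an assumption made for illustration: the names `EnvSpec`, `Environment`, `parse`, `generate`, `validate`, and `build_environments`, the "<task> with <n> objects" description format, and the unit-square feasibility check are hypothetical and are not taken from ClawEnvKit itself.

```python
from dataclasses import dataclass
import random


@dataclass(frozen=True)
class EnvSpec:
    """Hypothetical structured form of a natural-language description."""
    task: str
    num_objects: int


@dataclass(frozen=True)
class Environment:
    """Hypothetical generated environment: a task plus object positions."""
    task: str
    object_positions: tuple


def parse(description: str) -> EnvSpec:
    # Toy parser (assumption): expects "<task> with <n> objects".
    task, _, rest = description.partition(" with ")
    return EnvSpec(task=task.strip(), num_objects=int(rest.split()[0]))


def generate(spec: EnvSpec, seed: int) -> Environment:
    # Toy generator (assumption): samples object positions in the unit square.
    rng = random.Random(seed)
    positions = tuple(
        (round(rng.uniform(0.0, 1.0), 3), round(rng.uniform(0.0, 1.0), 3))
        for _ in range(spec.num_objects)
    )
    return Environment(task=spec.task, object_positions=positions)


def validate(env: Environment, spec: EnvSpec) -> bool:
    # Consistency: the environment matches the parsed spec.
    consistent = (
        env.task == spec.task
        and len(env.object_positions) == spec.num_objects
    )
    # Feasibility (toy stand-in): every object lies inside the workspace.
    feasible = all(
        0.0 <= x <= 1.0 and 0.0 <= y <= 1.0 for x, y in env.object_positions
    )
    return consistent and feasible


def build_environments(description: str, count: int) -> list:
    spec = parse(description)
    candidates = (generate(spec, seed) for seed in range(count))
    # Diversity (toy stand-in): a set drops exact duplicates.
    unique = {env for env in candidates if validate(env, spec)}
    return sorted(unique, key=lambda env: env.object_positions)


if __name__ == "__main__":
    for env in build_environments("stack blocks with 3 objects", count=5):
        print(env)
```

The sketch only mirrors the three-stage shape described in the abstract; a real validator would presumably check far more than bounds and counts.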
