AgentSearchBench: A Benchmark for AI Agent Search in the Wild.

By: Bin Wu, Arastun Mammadli, Xiaoyu Zhang, Emine Yilmaz

Published: 2026-04-27

Subject: cs.AI

Abstract

This paper presents AgentSearchBench, a new benchmark for evaluating AI agents on complex, real-world search tasks, providing a framework for assessing agent capabilities in unconstrained, "in the wild" environments.
