GovBench: Benchmarking LLM Agents for Real-World Data Governance Workflows
By: Zhou Liu, Zhaoyang Han, Guochen Yan, Hao Liang, Bohan Zeng, Xing Chen, Yuanfeng Song, Wentao Zhang
Published: 2025-12-05
Abstract
GovBench is a benchmark for evaluating Large Language Model (LLM) agents on real-world data governance workflows. Such evaluation is crucial for deploying trustworthy AI in regulated environments.