GovBench: Benchmarking LLM Agents for Real-World Data Governance Workflows

By: Zhou Liu, Zhaoyang Han, Guochen Yan, Hao Liang, Bohan Zeng, Xing Chen, Yuanfeng Song, Wentao Zhang

Published: 2025-12-05

Abstract

GovBench is a benchmark for evaluating Large Language Model (LLM) agents on real-world data governance workflows, a capability crucial for deploying trustworthy AI in regulated environments.
