GovBench: Benchmarking LLM Agents for Real-World Data Governance Workflows
Authors: Zhou Liu, Zhaoyang Han, Guochen Yan, Hao Liang, Bohan Zeng, Xing Chen, Yuanfeng Song, Wentao Zhang
Published: 2025-12-05
Abstract
GovBench is a benchmark for evaluating Large Language Model (LLM) agents on real-world data governance workflows, a capability that is crucial for deploying trustworthy AI in regulated environments.