AI Benchmark Democratization and Carpentry

By: Gregor von Laszewski, Wesley Brewer, Jeyan Thiyagalingam, Juri Papay, Armstrong Foundjem, Piotr Luszczek, Murali Emani, Shirley V. Moore, Vijay Janapa Reddi, Matthew D. Sinclair, Sebastian Lobentanzer, Sujata Goswami, Benjamin Hawks, Marco Colombo, Nhan Tran, Christine R. Kirkpatrick, Abdulkareem Alsudais, Gregg Barrett, Tianhao Li, Kirsten Morehouse, Shivaram Venkataraman, Rutwik Jain, Kartik Mathur, Victor Lu, Tejinder Singh, Khojasteh Z. Mirza, Kongtao Chen, Sasidhar Kunapuli, Gavin Farrell, Renato Umeton, Geoffrey C. Fox

Published: 2025-12-15

arXiv category: cs.AI

Abstract

This paper advocates for dynamic and inclusive benchmarking so that AI evaluation keeps pace with the field's evolution, supporting responsible, reproducible, and accessible AI deployment. It aims to improve how AI systems are assessed for real-world applications and to enable informed, context-sensitive decisions.
