Scaling Laws for Energy Efficiency of Local LLMs
By: Ander Alvarez, Alessandro Genuardi, Nilotpal Sinha, Antonio Tiene, Samuel Mugel, Román Orús
Published: 2025-12-18
Category: cs.AI
Abstract
Deploying local large language models (LLMs) and vision-language models (VLMs) on edge devices requires balancing accuracy against constrained compute and energy budgets. This paper systematically benchmarks LLMs and VLMs across CPU tiers, uncovering scaling laws that relate computational cost to token length and image resolution. It further shows that quantum-inspired compression can reduce energy consumption by up to 62% while preserving accuracy, enabling sustainable edge inference.
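To illustrate what "scaling laws for computational cost with token length" means in practice, the sketch below fits a power law E(L) = a·L^b to energy measurements via log-log linear regression. This is a minimal illustration only: the function name, the data, and the exponent are assumptions for demonstration, not the paper's measured results or fitting procedure.

```python
import numpy as np

def fit_power_law(x, y):
    """Fit y = a * x^b by least squares in log-log space.

    Illustrative sketch: the paper's actual fitting method and
    measurements are not reproduced here. Returns (a, b).
    """
    # In log space the power law becomes linear: log y = log a + b * log x
    b, log_a = np.polyfit(np.log(x), np.log(y), 1)
    return float(np.exp(log_a)), float(b)

# Hypothetical energy readings (joules) at several prompt lengths (tokens),
# generated from an assumed power law purely for demonstration.
tokens = np.array([64, 128, 256, 512, 1024])
energy = 0.5 * tokens ** 1.2

a, b = fit_power_law(tokens, energy)
print(f"E(L) ~ {a:.2f} * L^{b:.2f}")
```

A fitted exponent b near 1 would indicate roughly linear energy growth with prompt length; b > 1 would indicate superlinear growth, which is what makes long-context inference costly on constrained CPUs.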