FlexLLM: Composable HLS Library for Flexible Hybrid LLM Accelerator Design
By: Jiahao Zhang, Zifan He, Nicholas Fraser, Michaela Blott, Yizhou Sun, Jason Cong
Published: 2026-01-23
Category: cs.AI
Abstract
This paper introduces FlexLLM, a composable High-Level Synthesis (HLS) library for building flexible hybrid Large Language Model (LLM) accelerators. It aims to streamline the development of efficient, adaptable hardware for LLM inference, addressing the growing computational demands of deploying state-of-the-art models in real-world settings.