FlexLLM: Composable HLS Library for Flexible Hybrid LLM Accelerator Design

By: Jiahao Zhang, Zifan He, Nicholas Fraser, Michaela Blott, Yizhou Sun, Jason Cong

Published: 2026-01-23

#cs.AI

Abstract

This paper introduces FlexLLM, a composable High-Level Synthesis (HLS) library for designing flexible hybrid Large Language Model (LLM) accelerators. It aims to streamline the development of efficient, adaptable hardware for LLM inference, addressing the growing computational demands of deploying state-of-the-art models in real-world settings.
