EVATok: Adaptive Length Video Tokenization for Efficient Visual Autoregressive Generation

This paper proposes EVATok, an adaptive-length video tokenization method for efficient visual autoregressive generation. By dynamically adjusting token lengths, it aims to make video generation models more efficient, improving performance while reducing computational cost. The method is described as particularly useful for high-quality video synthesis and editing applications. The work was accepted to CVPR 2026.
