LongVie 2: Multimodal Controllable Ultra-Long Video World Model
By: Jianxiong Gao, Zhaoxi Chen, Xian Liu, Junhao Zhuang, Chengming Xu, Jianfeng Feng, Yu Qiao, Yanwei Fu, Chenyang Si, Ziwei Liu
Published: 2025-12-16
Category: cs.AI
Abstract
This paper introduces LongVie 2, a multimodal, controllable world model for ultra-long video. It focuses on generating and understanding extended video sequences with high fidelity and controllability. Potential applications include video content creation, realistic simulation environments, and advanced human-computer interaction.