VideoMaMa: Mask-Guided Video Matting via Generative Prior

By: Sangbeom Lim, Seoung Wug Oh, Jiahui Huang, Heeji Yoon, Seungryong Kim, Joon-Young Lee

Published: 2026-01-20

Subjects: cs.AI

Abstract

Generalizing video matting models to real-world videos remains a significant challenge due to the scarcity of labeled data. We present VideoMaMa, a mask-guided video matting framework that converts coarse segmentation masks into pixel-accurate alpha mattes by leveraging pretrained video diffusion models. VideoMaMa demonstrates strong zero-shot generalization to real-world footage, even when trained solely on synthetic data. Our approach also includes a scalable pseudo-labeling pipeline for large-scale video matting and the resulting Matting Anything in Video (MA-V) dataset, making it suitable for professional video editing and content creation.
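To make the mask-guided setup concrete, the sketch below shows how such a model might be invoked at inference time: a video clip and its coarse segmentation masks are passed together as conditioning, and the network produces per-frame alpha mattes. This is a minimal illustration, not the authors' code; the class name `MaskGuidedMattingModel`, the method `predict_alpha`, and the stand-in convolutional backbone are all hypothetical placeholders for the pretrained video diffusion prior described in the abstract.

```python
# Hypothetical sketch of mask-guided video matting inference.
# All names here are assumptions, not the released VideoMaMa API.

import torch
import torch.nn as nn


class MaskGuidedMattingModel(nn.Module):
    """Placeholder for a video diffusion backbone finetuned to map
    (RGB frames, coarse masks) -> per-frame alpha mattes."""

    def __init__(self):
        super().__init__()
        # Stand-in for the pretrained video diffusion network.
        self.net = nn.Conv3d(4, 1, kernel_size=3, padding=1)

    @torch.no_grad()
    def predict_alpha(self, frames: torch.Tensor, masks: torch.Tensor) -> torch.Tensor:
        # frames: (B, 3, T, H, W) RGB clip in [0, 1]
        # masks:  (B, 1, T, H, W) coarse binary segmentation masks
        x = torch.cat([frames, masks], dim=1)  # condition on both inputs
        alpha = torch.sigmoid(self.net(x))     # alpha matte per frame in [0, 1]
        return alpha


if __name__ == "__main__":
    model = MaskGuidedMattingModel().eval()
    clip = torch.rand(1, 3, 8, 64, 64)                   # 8-frame toy clip
    coarse = (torch.rand(1, 1, 8, 64, 64) > 0.5).float()  # coarse masks
    mattes = model.predict_alpha(clip, coarse)
    print(mattes.shape)  # torch.Size([1, 1, 8, 64, 64])
```

The key design point the sketch captures is that the matting model refines an existing mask rather than segmenting from scratch, which lets coarse, easily obtained masks (e.g. from a segmentation model) be upgraded to pixel-accurate mattes.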
