Masked Depth Modeling for Spatial Perception
By: Bin Tan, Changjiang Sun, Xiage Qin, Hanat Adai, Zelin Fu, Tian Zhou, Han Zhang, Yinghao Xu, Xing Zhu, Yujun Shen, Nan Xue
Published: 2026-01-25
View on arXiv (cs.AI)
Abstract
Robbyant introduces Masked Depth Modeling (MDM), a framework that treats the natural failure patterns of RGB-D cameras as masking signals for learning, producing dense, metric-scale, pixel-aligned depth maps. MDM consistently outperforms commercial depth sensors and existing methods across diverse benchmarks and real-world scenarios, strengthening spatial perception for applications such as autonomous driving and robotic manipulation.
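The core idea, as the abstract describes it, is that the pixels where an RGB-D sensor fails to return a reading naturally form a mask, in the spirit of masked modeling. A minimal sketch of one plausible ingredient, a reconstruction loss supervised only at valid sensor pixels, is shown below; the function name, the use of zero as the invalid-depth sentinel, and the L1 choice are illustrative assumptions, not details from the paper.

```python
import numpy as np

def masked_depth_loss(pred_depth, raw_depth, invalid_value=0.0):
    """Mean L1 error evaluated only where the RGB-D sensor returned a
    valid reading. The failed (invalid) pixels act as the mask that the
    model must fill in, analogous to masked modeling.
    Note: the zero sentinel and L1 form are assumptions for illustration."""
    valid = raw_depth != invalid_value          # sensor succeeded here
    if not valid.any():
        return 0.0
    return float(np.abs(pred_depth[valid] - raw_depth[valid]).mean())

# Toy example: a 4x4 metric depth map with two sensor dropouts (zeros).
raw = np.array([[1.0, 1.2, 0.0, 1.1],
                [1.0, 0.0, 1.3, 1.2],
                [0.9, 1.0, 1.1, 1.0],
                [1.0, 1.1, 1.2, 1.3]])
pred = np.full_like(raw, 1.1)                  # dense model prediction
loss = masked_depth_loss(pred, raw)            # averaged over 14 valid pixels
```

Supervising only at valid pixels lets raw, incomplete sensor output serve directly as training signal, with no manually completed ground-truth depth required.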