Adaptive Confidence Regularization for Multimodal Failure Detection
By: Moru Liu, Hao Dong, Olga Fink, Mario Trapp
Published: 2026-03-03
Abstract
The deployment of multimodal models in high-stakes domains, such as self-driving vehicles and medical diagnostics, demands not only strong predictive performance but also reliable mechanisms for detecting failures. In this work, we address the largely unexplored problem of failure detection in multimodal contexts. We propose Adaptive Confidence Regularization (ACR), a novel framework specifically designed to detect multimodal failures. Our approach is driven by a key observation: a multimodal prediction should be confident when all modalities agree and uncertain when they disagree, and most failures coincide with such disagreement. ACR explicitly models this discrepancy by learning a confidence score that regularizes the multimodal prediction. Our experiments demonstrate that ACR significantly improves failure detection performance across a range of multimodal tasks and datasets, especially in challenging real-world scenarios with noisy and incomplete data.
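The abstract does not specify how the agreement-based confidence score is computed, but the core idea can be illustrated with a minimal sketch. Everything below is an assumption for illustration: the function name `acr_confidence`, the use of two modalities with softmax outputs, and the distribution-overlap agreement proxy are not taken from the paper; the actual ACR framework learns its confidence score rather than computing it in closed form.

```python
import numpy as np

def softmax(z):
    """Numerically stable softmax over the last axis."""
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def acr_confidence(logits_a, logits_b, logits_fused):
    """Hypothetical sketch of the agreement idea behind ACR:
    damp the fused prediction's confidence when modalities disagree.
    All names and the agreement measure are illustrative assumptions."""
    p_a = softmax(logits_a)
    p_b = softmax(logits_b)
    p_f = softmax(logits_fused)
    # Agreement proxy: overlap of the two per-modality class distributions.
    # Close to 1 when both modalities predict the same distribution,
    # close to 0 when their probability mass sits on different classes.
    agreement = np.minimum(p_a, p_b).sum(axis=-1)
    # Regularized confidence: fused max-probability, scaled by agreement,
    # so disagreement lowers the reported confidence.
    return p_f.max(axis=-1) * agreement
```

With this sketch, two modalities that place their mass on the same class yield a confidence near the fused model's own max-probability, while conflicting modalities drive the score toward zero, which is the behavior the abstract's key observation calls for.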