DA-DPO: Cost-efficient Difficulty-aware Preference Optimization for Reducing MLLM Hallucinations

By: Longtian Qiu, Shan Ning, Chuyu Zhang, Jiaxuan Sun, Xuming He

Published: 2026-01-26

Subject: cs.AI

Abstract

This work presents DA-DPO, a cost-efficient, difficulty-aware preference optimization method aimed at significantly reducing hallucinations in Multimodal Large Language Models (MLLMs). By adapting the optimization to the difficulty of the training content, the approach improves the factual consistency and reliability of MLLM outputs.
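The abstract does not spell out the training objective, but the name indicates DA-DPO builds on Direct Preference Optimization (DPO). As a rough illustration only, the sketch below layers a hypothetical per-sample difficulty weight onto the standard DPO loss; the function name, the weighting scheme, and the `difficulty` scores are assumptions for illustration, not the paper's actual method.

```python
# Hypothetical sketch of a difficulty-weighted DPO loss. The base objective
# is standard DPO (Rafailov et al., 2023); the difficulty weighting is an
# illustrative assumption, since the abstract does not describe the method.
import torch
import torch.nn.functional as F

def difficulty_weighted_dpo_loss(
    policy_chosen_logps: torch.Tensor,    # log pi_theta(y_w | x), shape (B,)
    policy_rejected_logps: torch.Tensor,  # log pi_theta(y_l | x), shape (B,)
    ref_chosen_logps: torch.Tensor,       # log pi_ref(y_w | x), shape (B,)
    ref_rejected_logps: torch.Tensor,     # log pi_ref(y_l | x), shape (B,)
    difficulty: torch.Tensor,             # assumed per-sample score in [0, 1]
    beta: float = 0.1,
) -> torch.Tensor:
    """Standard DPO loss, reweighted per sample by a difficulty score."""
    # Implicit reward margins relative to the frozen reference model.
    chosen_rewards = policy_chosen_logps - ref_chosen_logps
    rejected_rewards = policy_rejected_logps - ref_rejected_logps
    logits = beta * (chosen_rewards - rejected_rewards)
    # Per-sample DPO loss: negative log-sigmoid of the reward margin.
    per_sample = -F.logsigmoid(logits)
    # Hypothetical weighting: upweight harder preference pairs.
    weights = 1.0 + difficulty
    return (weights * per_sample).mean()

# Toy usage with random log-probabilities for a batch of 4 pairs.
B = 4
loss = difficulty_weighted_dpo_loss(
    torch.randn(B), torch.randn(B), torch.randn(B), torch.randn(B),
    difficulty=torch.rand(B),
)
print(loss.item())
```

The weighting here simply scales each pair's gradient contribution; any cost savings in the actual method (e.g., from how difficulty is estimated or how pairs are selected) are not captured by this sketch.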
