Shedding Light on Depth: Explainability Assessment in Monocular Depth Estimation

📅 2025-09-19
📈 Citations: 0
Influential: 0
🤖 AI Summary
Monocular depth estimation (MDE) models lack interpretable evaluation, which hinders their trustworthy deployment in safety-critical applications. This work systematically investigates how MDE networks map input images to predicted depth maps and introduces Attribution Fidelity, a metric that quantifies the consistency between attribution maps and depth predictions and identifies unreliable explanations that conventional metrics such as Infidelity overlook. Saliency Maps, Integrated Gradients, and Attention Rollout are empirically evaluated on both a lightweight and a deep MDE architecture: Saliency Maps perform best on the lightweight model, while Integrated Gradients proves more reliable on the deep one. Results show that Attribution Fidelity improves the reliability assessment of attribution methods, establishing a reproducible, comparable benchmark framework for explainability analysis of MDE models.

📝 Abstract
Explainable artificial intelligence is increasingly employed to understand the decision-making process of deep learning models and to build trust in their adoption. However, the explainability of Monocular Depth Estimation (MDE) remains largely unexplored despite its wide deployment in real-world applications. In this work, we study how MDE networks map the input image to the predicted depth map. More specifically, we investigate well-established feature attribution methods, namely Saliency Maps, Integrated Gradients, and Attention Rollout, on two MDE models of different computational complexity: METER, a lightweight network, and PixelFormer, a deep network. We assess the quality of the generated visual explanations by selectively perturbing the most relevant and irrelevant pixels, as identified by the explainability methods, and analyzing the impact of these perturbations on the model's output. Moreover, since existing evaluation metrics can have limitations in measuring the validity of visual explanations for MDE, we additionally introduce Attribution Fidelity. This metric evaluates the reliability of a feature attribution by assessing its consistency with the predicted depth map. Experimental results demonstrate that Saliency Maps and Integrated Gradients perform well in highlighting the most important input features for lightweight and deep MDE models, respectively. Furthermore, we show that Attribution Fidelity effectively identifies when an explainability method fails to produce reliable visual maps, even in scenarios where conventional metrics might suggest satisfactory results.
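The perturbation-based assessment described in the abstract can be sketched as follows. This is a minimal illustration, not the paper's implementation: `perturbation_impact`, the occlusion-by-zeroing strategy, and the toy brightness-based model are all assumptions made for the example; the general idea is that masking the pixels an attribution method ranks as most relevant should change the predicted depth more than masking the least relevant ones.

```python
import numpy as np

def perturbation_impact(model, image, attribution, frac=0.1, target="relevant"):
    """Zero out the top (or bottom) `frac` of pixels ranked by the
    attribution map and return the mean absolute change in the depth map.
    Illustrative helper, not from the paper."""
    h, w = attribution.shape
    n = max(1, int(frac * h * w))
    order = np.argsort(attribution, axis=None)       # ascending relevance
    idx = order[-n:] if target == "relevant" else order[:n]
    perturbed = image.copy()
    ys, xs = np.unravel_index(idx, (h, w))
    perturbed[ys, xs] = 0.0                          # occlude selected pixels
    return float(np.abs(model(perturbed) - model(image)).mean())

# Toy stand-in for an MDE network: "depth" is just per-pixel brightness.
toy_model = lambda img: img.mean(axis=-1)

rng = np.random.default_rng(0)
img = rng.random((8, 8, 3))
attr = img.mean(axis=-1)                             # pretend attribution map
hit_relevant = perturbation_impact(toy_model, img, attr, target="relevant")
hit_irrelevant = perturbation_impact(toy_model, img, attr, target="irrelevant")
# A faithful attribution should give hit_relevant > hit_irrelevant.
```

The same comparison, run with real attribution maps and a real MDE network, is what distinguishes explanations that genuinely track the model's behavior from those that merely look plausible.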
Problem

Research questions and friction points this paper is trying to address.

Assessing explainability methods for monocular depth estimation models
Evaluating visual explanation quality through selective pixel perturbation
Introducing Attribution Fidelity metric to validate feature attribution reliability
Innovation

Methods, ideas, or system contributions that make the work stand out.

Feature attribution methods for depth estimation
Attribution Fidelity metric for explanation evaluation
Selective pixel perturbation analysis technique
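To make the Attribution Fidelity idea concrete, the sketch below scores how consistent an attribution map is with the predicted depth map. The scoring rule (Pearson correlation between the attribution map and the depth-gradient magnitude) is a hypothetical stand-in chosen for this illustration; the paper's actual formula is not reproduced here.

```python
import numpy as np

def attribution_consistency(attribution, depth):
    """Hypothetical consistency score (illustration only, NOT the paper's
    Attribution Fidelity formula): Pearson correlation between the
    attribution map and the depth-gradient magnitude, on the assumption
    that reliable explanations concentrate near depth discontinuities."""
    gy, gx = np.gradient(depth)
    edges = np.hypot(gx, gy)
    a = attribution.ravel() - attribution.mean()
    e = edges.ravel() - edges.mean()
    denom = np.linalg.norm(a) * np.linalg.norm(e)
    return float(a @ e / denom) if denom > 0 else 0.0

rng = np.random.default_rng(1)
depth = rng.random((16, 16))
gy, gx = np.gradient(depth)
aligned = np.hypot(gx, gy)            # attribution matching depth structure
score_good = attribution_consistency(aligned, depth)
score_bad = attribution_consistency(rng.random((16, 16)), depth)
# An attribution aligned with depth structure scores near 1.0;
# an unrelated random map scores lower.
```

Whatever the exact formulation, the point of such a metric is to flag explanations that are internally inconsistent with the depth prediction even when perturbation-style metrics report satisfactory results.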
Lorenzo Cirillo
Sapienza University of Rome, Rome, Italy
Claudio Schiavella
Sapienza University of Rome, Rome, Italy
Lorenzo Papa
Research Fellow in AI4EO at ESA / ESRIN Φ-lab
Deep Learning, Computer Vision
Paolo Russo
Sapienza University of Rome, Rome, Italy
Irene Amerini
Sapienza Università di Roma, Italy
Multimedia forensics and security