🤖 AI Summary
Multimodal fusion models for autonomous driving behave as black boxes: once sensor inputs are fused, it is hard to quantify how much each modality (e.g., camera, radar, LiDAR) contributes at each network layer. To address this, we propose Layer-Wise Modality Decomposition (LMD), which is, to our knowledge, the first post-hoc, model-agnostic method for attributing a perception model's predictions to individual input modalities in a sensor-fusion system for autonomous driving. LMD disentangles modality-specific information across all layers of a pretrained fusion model, so it can be applied to high-capacity architectures after training. It supports camera-radar, camera-LiDAR, and camera-radar-LiDAR configurations and yields per-layer modality attributions together with modality-wise visual decompositions. Its effectiveness is validated with structured perturbation-based metrics across these fusion settings. Code is publicly available.
📝 Abstract
In autonomous driving, transparency in the decision-making of perception models is critical, as even a single misperception can be catastrophic. Yet with multi-sensor inputs, it is difficult to determine how each modality contributes to a prediction, because sensor information becomes entangled within the fusion network. We introduce Layer-Wise Modality Decomposition (LMD), a post-hoc, model-agnostic interpretability method that disentangles modality-specific information across all layers of a pretrained fusion model. To our knowledge, LMD is the first approach to attribute the predictions of a perception model to individual input modalities in a sensor-fusion system for autonomous driving. We evaluate LMD on pretrained fusion models under camera-radar, camera-LiDAR, and camera-radar-LiDAR settings. Its effectiveness is validated using structured perturbation-based metrics and modality-wise visual decompositions, demonstrating its practical applicability to interpreting high-capacity multimodal architectures. Code is available at https://github.com/detxter-jvb/Layer-Wise-Modality-Decomposition.