🤖 AI Summary
This work addresses a core challenge in multi-modal vehicle re-identification: uneven modality quality distributions create conflicting fusion requirements, making it difficult to simultaneously preserve intra-class consistency and handle inter-modal heterogeneity. To this end, the authors propose DCG-ReID, a framework that disentangles collaboration and guidance fusion representations and, for the first time, partitions multi-modal fusion into two scenario-adaptive strategies. A Dynamic Confidence-based Disentangling Weighting (DCDW) mechanism distinguishes samples with balanced quality distributions from unbalanced ones, routing the former to a Collaboration Fusion Module (CFM) and the latter to a Guidance Fusion Module (GFM). By combining multi-modal interaction with difference amplification, the approach reweights modality contributions without mutual interference. Extensive experiments on the WMVeID863, MSVR310, and RGBNT100 benchmarks demonstrate significant performance gains over existing methods, validating the effectiveness and novelty of the proposed framework.
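To make the routing idea concrete, here is a minimal PyTorch sketch of how a confidence-based disentangling weighting step could look. Everything here, including the attention-based interaction, the `confidence_head`, and the balance threshold `tau`, is an illustrative assumption rather than the paper's actual DCDW implementation.

```python
# Hypothetical sketch of the DCDW routing idea; not the paper's code.
import torch
import torch.nn as nn

class DCDW(nn.Module):
    """Scores each modality's confidence and routes a sample to
    collaboration (balanced quality) or guidance (unbalanced) fusion."""
    def __init__(self, dim: int, tau: float = 0.2):
        super().__init__()
        # Cross-modal interaction, then a scalar confidence per modality.
        self.interact = nn.MultiheadAttention(dim, num_heads=4, batch_first=True)
        self.confidence_head = nn.Linear(dim, 1)
        self.tau = tau  # assumed hyperparameter separating balanced/unbalanced

    def forward(self, rgb, nir, tir):
        # Stack modality tokens: (B, 3, D)
        x = torch.stack([rgb, nir, tir], dim=1)
        # Interaction-derived features: each modality attends to the others.
        h, _ = self.interact(x, x, x)
        # Per-modality confidence, normalized over the three modalities.
        conf = self.confidence_head(h).squeeze(-1)   # (B, 3)
        weights = torch.softmax(conf, dim=-1)        # (B, 3)
        # Reweight modality features by their confidence.
        x = x * weights.unsqueeze(-1)
        # A sample is "balanced" when the confidence spread is small.
        balanced = (weights.max(-1).values - weights.min(-1).values) < self.tau
        return x, weights, balanced
```

In this sketch, `balanced` would select which fusion branch (CFM or GFM, see below) processes each sample, so the two data types never interfere with one another.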
📝 Abstract
Multi-modal vehicle Re-Identification (ReID) aims to leverage complementary information from RGB, Near Infrared (NIR), and Thermal Infrared (TIR) modalities to retrieve images of the same vehicle. The challenge of multi-modal vehicle ReID stems from the uncertain modality quality distribution induced by inherent discrepancies across modalities, which produces conflicting fusion requirements for data with balanced versus unbalanced quality distributions. Existing methods handle all multi-modal data with a single fusion model, overlooking the different needs of these two data types and making it difficult to decouple the conflict between intra-class consistency and inter-modal heterogeneity. To this end, we propose Disentangle Collaboration and Guidance Fusion Representations for Multi-modal Vehicle ReID (DCG-ReID). Specifically, to process modal data with heterogeneous quality distributions without mutual interference, we first design a Dynamic Confidence-based Disentangling Weighting (DCDW) mechanism that dynamically reweights the contributions of the three modalities via interaction-derived modal confidence, yielding a disentangled fusion framework. Building on DCDW, we develop two scenario-specific fusion strategies: (1) for balanced quality distributions, the Collaboration Fusion Module (CFM) mines pairwise consensus features to capture shared discriminative information and strengthen intra-class consistency; (2) for unbalanced distributions, the Guidance Fusion Module (GFM) differentially amplifies modal discriminative disparities to reinforce the dominant modality's advantages, guide auxiliary modalities in mining complementary discriminative information, and mitigate inter-modal divergence, thereby improving multi-modal joint decision performance. Extensive experiments on three multi-modal ReID benchmarks (WMVeID863, MSVR310, RGBNT100) validate the effectiveness of our method. Code will be released upon acceptance.
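For intuition, the following hedged sketch shows one plausible realization of the two fusion branches: a pairwise-consensus operator for CFM and a dominant-modality-guided attention with a difference-amplification term for GFM. The concrete operators (elementwise-product consensus, the `diff` term, and the `weights` argument coming from the DCDW sketch above) are assumptions for illustration, not the paper's actual modules.

```python
# Hypothetical sketches of the two scenario-specific fusion branches.
import torch
import torch.nn as nn

class CFM(nn.Module):
    """Collaboration fusion: mine pairwise consensus among RGB/NIR/TIR to
    strengthen shared, intra-class-consistent features."""
    def __init__(self, dim: int):
        super().__init__()
        self.proj = nn.Linear(dim, dim)

    def forward(self, x):                   # x: (B, 3, D), confidence-weighted
        r, n, t = x.unbind(dim=1)
        # Pairwise consensus via elementwise products (gated agreement).
        consensus = (r * n + r * t + n * t) / 3.0
        return self.proj(consensus + x.mean(dim=1))

class GFM(nn.Module):
    """Guidance fusion: amplify the dominant modality's discriminative
    advantage and let it guide the weaker, complementary modalities."""
    def __init__(self, dim: int):
        super().__init__()
        self.guide = nn.MultiheadAttention(dim, num_heads=4, batch_first=True)
        self.proj = nn.Linear(dim, dim)

    def forward(self, x, weights):          # weights: (B, 3) from DCDW
        # Pick the dominant modality per sample as the attention query.
        idx = weights.argmax(dim=-1)                   # (B,)
        dominant = x[torch.arange(x.size(0)), idx]     # (B, D)
        # Dominant modality queries the auxiliaries for complementary cues;
        # the residual difference term amplifies modal disparities.
        guided, _ = self.guide(dominant.unsqueeze(1), x, x)
        diff = dominant.unsqueeze(1) - x.mean(dim=1, keepdim=True)
        return self.proj((guided + diff).squeeze(1))
```

Under this reading, balanced samples pass through `CFM` to consolidate what all modalities agree on, while unbalanced samples pass through `GFM` so the strongest modality steers fusion instead of being diluted by degraded ones.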