CKDA: Cross-modality Knowledge Disentanglement and Alignment for Visible-Infrared Lifelong Person Re-identification

📅 2025-11-19
📈 Citations: 0
Influential: 0
🤖 AI Summary
In Visible-Infrared Lifelong person Re-IDentification (VI-LReID), interference between modality-specific and modality-common knowledge leads to collaborative forgetting during continual learning. Existing cross-modal knowledge distillation methods fail to explicitly decouple these two types of knowledge, limiting performance. To address this, the authors propose a Cross-modality Knowledge Disentanglement and Alignment (CKDA) framework. First, a Modality-Common Prompting (MCP) module and a Modality-Specific Prompting (MSP) module disentangle the two kinds of knowledge into separate feature subspaces, avoiding mutual interference. Second, a Cross-modal Knowledge Alignment (CKA) module aligns the disentangled new knowledge with old knowledge in independent inter- and intra-modality feature spaces using dual-modality prototypes. This mitigates knowledge interference and collaborative forgetting, supporting continuous day-and-night person matching. CKDA achieves state-of-the-art performance on four benchmark datasets, and the code is publicly available.
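One way to picture the disentanglement step is as an orthogonality constraint between the two prompt types: the modality-common and modality-specific prompts are pushed into non-overlapping subspaces by penalizing their alignment. The sketch below is illustrative only (the function and variable names are assumptions, not the authors' implementation) and uses a squared cosine similarity as the penalty.

```python
# Hypothetical sketch of prompt disentanglement: penalize the overlap
# between a modality-common prompt and a modality-specific prompt so
# that training drives them toward orthogonal subspaces.
# Names are illustrative, not taken from the CKDA codebase.

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def norm(u):
    return sum(a * a for a in u) ** 0.5

def orthogonality_penalty(common_prompt, specific_prompt):
    """Squared cosine similarity: 0 when the prompts are orthogonal,
    approaching 1 when they are collinear (maximal interference)."""
    cos = dot(common_prompt, specific_prompt) / (
        norm(common_prompt) * norm(specific_prompt) + 1e-12
    )
    return cos * cos

# An orthogonal pair incurs no penalty; collinear prompts are penalized.
print(orthogonality_penalty([1.0, 0.0], [0.0, 1.0]))  # → 0.0
print(round(orthogonality_penalty([1.0, 1.0], [2.0, 2.0]), 6))
```

Minimizing this penalty alongside the usual re-identification losses would keep the discriminative content of the two prompts from interfering, which is the intuition behind avoiding collaborative forgetting.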

📝 Abstract
Lifelong person Re-IDentification (LReID) aims to match the same person using individual data continuously collected from different scenarios. To achieve continuous all-day person matching across day and night, Visible-Infrared Lifelong person Re-IDentification (VI-LReID) focuses on sequential training on data from visible and infrared modalities and pursues average performance over all data. To this end, existing methods typically exploit cross-modal knowledge distillation to alleviate the catastrophic forgetting of old knowledge. However, these methods ignore the mutual interference between modality-specific knowledge acquisition and modality-common knowledge anti-forgetting, where conflicting knowledge leads to collaborative forgetting. To address these problems, this paper proposes a Cross-modality Knowledge Disentanglement and Alignment method, called CKDA, which explicitly separates and preserves modality-specific and modality-common knowledge in a balanced way. Specifically, a Modality-Common Prompting (MCP) module and a Modality-Specific Prompting (MSP) module are proposed to explicitly disentangle and purify discriminative information that is shared across and specific to different modalities, avoiding mutual interference between the two types of knowledge. In addition, a Cross-modal Knowledge Alignment (CKA) module is designed to further align the disentangled new knowledge with the old in two mutually independent inter- and intra-modality feature spaces, based on dual-modality prototypes, in a balanced manner. Extensive experiments on four benchmark datasets verify the effectiveness and superiority of our CKDA against state-of-the-art methods. The source code of this paper is available at https://github.com/PKU-ICST-MIPL/CKDA-AAAI2026.
Problem

Research questions and friction points this paper is trying to address.

Addresses catastrophic forgetting in lifelong visible-infrared person re-identification
Solves modality-specific and common knowledge interference during sequential training
Balances cross-modal knowledge preservation to prevent collaborative forgetting
Innovation

Methods, ideas, or system contributions that make the work stand out.

Disentangles modality-specific and common knowledge
Aligns cross-modal knowledge using dual prototypes
Uses prompting modules to avoid knowledge interference
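The dual-prototype alignment idea in the bullets above can be sketched as a balanced loss: a new feature is pulled toward both the visible-modality and infrared-modality prototypes of its identity, so neither modality's old knowledge dominates. This is a minimal illustration under assumed names, not the paper's actual objective.

```python
# Illustrative sketch (not the CKDA implementation) of balanced
# alignment to dual-modality prototypes: old knowledge is preserved by
# pulling a feature toward the stored visible and infrared prototypes
# of its identity with equal weight.

def sq_dist(u, v):
    return sum((a - b) ** 2 for a, b in zip(u, v))

def dual_prototype_alignment(feature, vis_prototype, ir_prototype):
    """Balanced alignment loss: mean squared distance to both
    modality prototypes, weighting the two modalities equally."""
    return 0.5 * (sq_dist(feature, vis_prototype)
                  + sq_dist(feature, ir_prototype))

# A feature midway between the two prototypes balances both modalities.
feat = [0.5, 0.5]
print(dual_prototype_alignment(feat, [1.0, 0.0], [0.0, 1.0]))  # → 0.5
```

Equal weighting of the two modality terms reflects the "balanced manner" the abstract emphasizes: a feature that drifts toward only one modality's prototype is penalized on the other, which discourages collaborative forgetting.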