Dynamic Token Selection for Aerial-Ground Person Re-Identification

📅 2024-11-30
📈 Citations: 0
Influential: 0

🤖 AI Summary
To address weak discriminative feature learning in aerial-ground person re-identification (AGPReID) caused by large viewpoint discrepancies, illumination variations, and cluttered backgrounds, this paper proposes the Dynamic Token Selective Transformer (DTST). The method builds local representations via region-based tokenization, uses a Top-k sparse token selection mechanism to adaptively focus on the most discriminative visual tokens, and models their structural relationships through multi-head self-attention, which strengthens the robustness of cross-view identity representations. The model is end-to-end trainable and is described as the first application of dynamic sparse token selection to AGPReID. On the CARGO benchmark, it achieves state-of-the-art mAP, outperforming the second-best method by 1.18%. Ablation studies examine the influence of token count, token insertion position, and number of attention heads.
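
The region-based tokenization step can be illustrated with a minimal sketch. It assumes non-overlapping square patches over a single-channel image stored as a list of rows. The function name `patchify` and the patch layout are illustrative assumptions, not details taken from the paper.

```python
def patchify(image, patch):
    """Split an H x W grid (list of rows) into non-overlapping patch x patch tokens.

    Assumption: square, non-overlapping regions; the paper's exact
    tokenization scheme is not specified in this summary.
    """
    height, width = len(image), len(image[0])
    if height % patch or width % patch:
        raise ValueError("image size must be divisible by patch size")
    tokens = []
    for top in range(0, height, patch):
        for left in range(0, width, patch):
            # Flatten one region into a single token vector, row by row.
            token = [image[r][c]
                     for r in range(top, top + patch)
                     for c in range(left, left + patch)]
            tokens.append(token)
    return tokens


# A 4 x 4 toy image split into four 2 x 2 regions.
image = [[r * 4 + c for c in range(4)] for r in range(4)]
tokens = patchify(image, 2)
```

Each token here is the flattened pixel content of one region. A transformer would normally pass these vectors through a learned linear projection before any selection or attention step.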

📝 Abstract
Aerial-Ground Person Re-identification (AGPReID) holds significant practical value but faces unique challenges due to pronounced variations in viewing angles, lighting conditions, and background interference. Traditional methods, which often analyze the entire image globally, tend to be inefficient and susceptible to irrelevant data. In this paper, we propose a novel Dynamic Token Selective Transformer (DTST) tailored for AGPReID, which dynamically selects pivotal tokens to concentrate on pertinent regions. Specifically, we segment the input image into multiple tokens, with each token representing a unique region or feature within the image. Using a Top-k strategy, we extract the k most significant tokens, which contain the information essential for identity recognition. Subsequently, an attention mechanism is employed to capture the relationships among these tokens, thereby enhancing the representation of identity features. Extensive experiments on benchmark datasets showcase the superiority of our method over existing works. Notably, on the CARGO dataset, our proposed method achieves a 1.18% mAP improvement over the second-best method. In addition, we comprehensively analyze the impact of different numbers of tokens, token insertion positions, and numbers of heads on model performance.
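
The select-then-attend pipeline described in the abstract can be sketched in a few lines of plain Python. Several parts are assumptions made only for illustration: tokens are scored by their L2 norm (the abstract does not state the scoring criterion), attention is single-head with identity query/key/value projections, and the names `top_k_indices`, `self_attention`, and `select_and_attend` are hypothetical. This is not the authors' implementation.

```python
import math


def top_k_indices(scores, k):
    """Indices of the k largest scores, returned in original token order."""
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return sorted(ranked[:k])


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(tokens):
    """Single-head scaled dot-product attention.

    Assumption: identity Q/K/V projections, so no learned weights.
    """
    dim = len(tokens[0])
    scale = math.sqrt(dim)
    output = []
    for query in tokens:
        logits = [sum(q * k for q, k in zip(query, key)) / scale
                  for key in tokens]
        weights = softmax(logits)
        # Each output token is a weighted average of all selected tokens.
        mixed = [sum(w * value[d] for w, value in zip(weights, tokens))
                 for d in range(dim)]
        output.append(mixed)
    return output


def select_and_attend(tokens, k):
    # Assumption: the L2 norm stands in for the paper's significance score.
    scores = [math.sqrt(sum(x * x for x in t)) for t in tokens]
    kept = top_k_indices(scores, k)
    selected = [tokens[i] for i in kept]
    return kept, self_attention(selected)


# Four toy tokens; the two with the largest norm are kept and mixed.
tokens = [[0.1, 0.0], [2.0, 1.0], [0.0, 0.2], [1.0, 3.0]]
kept, features = select_and_attend(tokens, k=2)
```

With `k=2`, only the tokens at indices 1 and 3 survive, so attention runs over two tokens instead of four. A practical version would use a tensor library, learned projections, and multiple heads.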
Problem

Research questions and friction points this paper is trying to address.

AGPReID
Viewpoint Variability
Recognition Efficiency
Innovation

Methods, ideas, or system contributions that make the work stand out.

Dynamic Token Selective Transformer
Aerial-Ground Person Re-Identification
CARGO Dataset