Following the Clues: Experiments on Person Re-ID using Cross-Modal Intelligence

📅 2025-07-02
🤖 AI Summary
Problem: Street-scene person re-identification (Re-ID) faces two challenges: privacy leakage, particularly from non-facial personally identifiable information (PII) in open datasets, and limited cross-domain robustness. Method: The paper proposes cRID, a cross-modal framework that uses semantic cues to detect non-facial PII in images. It combines large vision-language models to extract textually describable PII clues, a graph attention network to model relations among those clues, and representation learning to obtain interpretable features for Re-ID. Contribution/Results: In cross-dataset evaluation, notably Market-1501 → CUHK03-np (detected), cRID reports mAP improvements over baselines, and the authors systematically evaluate PII presence in person image datasets. The results indicate that interpretable, semantically meaningful clues can improve Re-ID generalization and can help assess privacy risks in openly released street-level data.
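
The summary above refers to mAP in a cross-dataset setting, where a model trained on one dataset ranks gallery images from another. As a rough illustration of how such scores are commonly computed, here is a minimal sketch in plain Python. The function names, the toy distance matrix, and the identity labels are assumptions made for illustration and are not taken from the cRID code. The sketch also omits the same-camera filtering that the standard Market-1501 protocol applies.

```python
from typing import List, Sequence, Tuple


def average_precision(ranked_matches: Sequence[bool]) -> float:
    """Average precision for one query, given gallery hits in ranked order."""
    hits = 0
    precision_sum = 0.0
    for rank, is_match in enumerate(ranked_matches, start=1):
        if is_match:
            hits += 1
            precision_sum += hits / rank  # precision at this recall point
    return precision_sum / hits if hits else 0.0


def evaluate(
    dist: List[List[float]],
    query_ids: Sequence[int],
    gallery_ids: Sequence[int],
) -> Tuple[float, float]:
    """Return (mAP, rank-1 accuracy) for a query-by-gallery distance matrix.

    Simplification (assumption): no same-camera or junk-image filtering.
    """
    aps = []
    rank1_hits = 0
    for q, row in enumerate(dist):
        # Sort gallery indices by ascending distance to the query.
        order = sorted(range(len(row)), key=lambda g: row[g])
        matches = [gallery_ids[g] == query_ids[q] for g in order]
        if not any(matches):
            continue  # identity absent from the gallery: skip this query
        aps.append(average_precision(matches))
        rank1_hits += int(matches[0])
    if not aps:
        return 0.0, 0.0
    return sum(aps) / len(aps), rank1_hits / len(aps)


# Toy data (hypothetical): 2 queries, 3 gallery images.
dist = [
    [0.1, 0.9, 0.4],  # query with identity 7
    [0.2, 0.5, 0.3],  # query with identity 8
]
query_ids = [7, 8]
gallery_ids = [7, 8, 7]

mean_ap, rank1 = evaluate(dist, query_ids, gallery_ids)
print(f"mAP={mean_ap:.3f} rank-1={rank1:.3f}")  # mAP=0.667 rank-1=0.500
```

In the toy data the first query retrieves both of its matches first (AP = 1), while the second finds its only match at rank 3 (AP = 1/3), which gives the printed mean.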

📝 Abstract
The collection and release of street-level recordings as Open Data play a vital role in advancing autonomous driving systems and AI research. However, these datasets pose significant privacy risks, particularly for pedestrians, because they contain Personally Identifiable Information (PII) that extends beyond biometric traits such as faces. In this paper, we present cRID, a novel cross-modal framework that combines Large Vision-Language Models, Graph Attention Networks, and representation learning to detect textually describable clues of PII and enhance person re-identification (Re-ID). Our approach focuses on identifying and leveraging interpretable features, enabling the detection of semantically meaningful PII beyond low-level appearance cues. We conduct a systematic evaluation of PII presence in person image datasets. Our experiments show improved performance in practical cross-dataset Re-ID scenarios, notably from Market-1501 to CUHK03-np (detected), highlighting the framework's practical utility. Code is available at https://github.com/RAufschlaeger/cRID.
Problem

Research questions and friction points this paper is trying to address.

Detect textually describable PII clues in street-level recordings
Enhance person re-identification using cross-modal intelligence
Address privacy risks in open datasets for autonomous driving
Innovation

Methods, ideas, or system contributions that make the work stand out.

Combines Large Vision-Language Models and Graph Attention Networks
Detects textually describable PII clues
Enhances cross-dataset person Re-ID performance
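
To make the graph attention idea in the list above more concrete, the following is a minimal sketch of a single-head graph attention layer in plain Python, using the commonly used formulation (score each neighbor with a LeakyReLU over a learned vector applied to the concatenated projected features, then softmax-normalize). It is not the cRID architecture: the node labels, feature values, weight matrix `W`, and attention vector `a` are all hypothetical, and a real system would learn `W` and `a` and take node features from a vision-language model.

```python
import math
from typing import List, Tuple

Vector = List[float]


def leaky_relu(x: float, slope: float = 0.2) -> float:
    return x if x > 0 else slope * x


def matvec(W: List[Vector], h: Vector) -> Vector:
    """Multiply matrix W (rows) by vector h."""
    return [sum(w * x for w, x in zip(row, h)) for row in W]


def gat_layer(
    features: List[Vector],
    neighbors: List[List[int]],
    W: List[Vector],
    a: Vector,
) -> Tuple[List[Vector], List[Vector]]:
    """Single-head graph attention.

    neighbors[i] lists the nodes that node i attends to; it must be
    non-empty (include i itself for a self-loop).
    Returns (updated node features, attention weights per node).
    """
    z = [matvec(W, h) for h in features]  # projected features
    outputs, attention = [], []
    for i, nbrs in enumerate(neighbors):
        # Unnormalized score: LeakyReLU(a . [z_i || z_j])
        scores = [
            leaky_relu(sum(ak * v for ak, v in zip(a, z[i] + z[j])))
            for j in nbrs
        ]
        # Numerically stable softmax over the neighborhood.
        top = max(scores)
        exps = [math.exp(s - top) for s in scores]
        total = sum(exps)
        alphas = [e / total for e in exps]
        # Attention-weighted sum of neighbor features.
        outputs.append([
            sum(al * z[j][k] for al, j in zip(alphas, nbrs))
            for k in range(len(z[i]))
        ])
        attention.append(alphas)
    return outputs, attention


# Hypothetical clue graph: node 0 = "person", 1 = "red backpack",
# 2 = "logo on jacket". Features are made-up 2-d embeddings.
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
neighbors = [[0, 1, 2], [0, 1], [0, 2]]  # each node includes a self-loop
W = [[1.0, 0.0], [0.0, 1.0]]             # identity projection (assumption)
a = [0.5, -0.5, 1.0, 0.25]               # arbitrary attention vector

updated, weights = gat_layer(features, neighbors, W, a)
for node, row in enumerate(weights):
    print(node, [round(w, 3) for w in row])
```

Each row of `weights` sums to 1, so the values can be read as how strongly one clue attends to the others, which is one way such a model can expose interpretable relations.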
Robert Aufschläger
Deggendorf Institute of Technology, Deggendorf, Germany
Youssef Shoeb
Continental AG, TU Berlin
Machine Learning, Computer Vision
Azarm Nowzad
Continental AG, Berlin, Germany
Michael Heigl
Deggendorf Institute of Technology, Deggendorf, Germany
Fabian Bally
Deggendorf Institute of Technology, Deggendorf, Germany
Martin Schramm
Deggendorf Institute of Technology, Deggendorf, Germany