Geodesic Prototype Matching via Diffusion Maps for Interpretable Fine-Grained Recognition

📅 2025-09-21
🤖 AI Summary
Euclidean distance does not capture the nonlinear manifold structure of deep features, so it models the semantic differences between fine-grained classes poorly. To address this, the paper proposes GeoProto, a manifold-aware prototype matching framework. Its core contribution combines diffusion maps with a differentiable Nyström interpolation, so that unseen samples and learnable prototypes are embedded in the same per-class diffusion space; compact per-class landmark sets, updated periodically, keep that space aligned with the evolving backbone and keep inference efficient. The method plugs into a deep feature extractor and is trained end to end. On CUB-200-2011 and Stanford Cars, it significantly outperforms Euclidean prototype networks, and the learned prototypes focus on semantically aligned parts, which supports interpretability.

📝 Abstract
Nonlinear manifolds are widespread in deep visual features, where Euclidean distances often fail to capture true similarity. This limitation becomes particularly severe in prototype-based interpretable fine-grained recognition, where subtle semantic distinctions are essential. To address this challenge, we propose a novel paradigm for prototype-based recognition that anchors similarity within the intrinsic geometry of deep features. Specifically, we distill the latent manifold structure of each class into a diffusion space and introduce a differentiable Nyström interpolation, making the geometry accessible to both unseen samples and learnable prototypes. To ensure efficiency, we employ compact per-class landmark sets with periodic updates. This design keeps the embedding aligned with the evolving backbone, enabling fast and scalable inference. Extensive experiments on the CUB-200-2011 and Stanford Cars datasets show that our GeoProto framework produces prototypes focusing on semantically aligned parts, significantly outperforming Euclidean prototype networks.
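
For readers unfamiliar with the two building blocks named in the abstract, the standard textbook forms of a diffusion map and its Nyström extension are sketched below. The notation (landmarks $x_1,\dots,x_n$, bandwidth $\varepsilon$, diffusion time $t$, embedding dimension $m$) is an assumption made for illustration and may differ from the paper's own formulation.

```latex
% Gaussian affinity between landmarks and its row-normalized Markov matrix
K_{ij} = \exp\!\left(-\frac{\lVert x_i - x_j \rVert^2}{\varepsilon}\right),
\qquad
d_i = \sum_{j=1}^{n} K_{ij},
\qquad
P_{ij} = \frac{K_{ij}}{d_i}.

% Eigenpairs of P, with 1 = \lambda_0 \ge \lambda_1 \ge \dots and \psi_0 constant
P\,\psi_k = \lambda_k\,\psi_k .

% Diffusion map of landmark x_i (the trivial pair k = 0 is dropped)
\Psi_t(x_i) = \bigl(\lambda_1^{t}\,\psi_1(i),\; \dots,\; \lambda_m^{t}\,\psi_m(i)\bigr).

% Nystrom extension to a new point x (an unseen sample or a prototype)
p(x, x_j) = \frac{\exp\!\left(-\lVert x - x_j \rVert^2 / \varepsilon\right)}
                 {\sum_{l=1}^{n} \exp\!\left(-\lVert x - x_l \rVert^2 / \varepsilon\right)},
\qquad
\hat{\psi}_k(x) = \frac{1}{\lambda_k} \sum_{j=1}^{n} p(x, x_j)\,\psi_k(x_j).
```

Because $\hat{\psi}_k(x)$ is built only from exponentials, a normalization, and a weighted sum, it is differentiable with respect to $x$, which is what allows gradients to reach both the backbone features and the prototypes. At a landmark the extension reproduces the original coordinate, since $\sum_j P_{ij}\,\psi_k(j) = \lambda_k\,\psi_k(i)$.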
Problem

Research questions and friction points this paper is trying to address.

Euclidean distances misjudge feature similarity on nonlinear manifolds
Capturing subtle semantic distinctions in fine-grained recognition tasks
Aligning prototype learning with intrinsic geometry of deep features
Innovation

Methods, ideas, or system contributions that make the work stand out.

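Uses diffusion maps to distill each class's feature manifold into a diffusion space
Introduces differentiable Nyström interpolation that extends this geometry to unseen samples and learnable prototypes
Employs compact per-class landmark sets with periodic updates to stay aligned with the evolving backbone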
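
The three ideas above can be illustrated with a minimal NumPy sketch for a single class. This is not the authors' implementation: the function names, the Gaussian kernel, and the parameters `eps`, `n_components`, and `t` are assumptions chosen for illustration, and a trainable version would express the same operations in an autodiff framework so that gradients flow through the Nyström step.

```python
import numpy as np


def pairwise_sq_dists(a, b):
    """Squared Euclidean distances between rows of a (n, d) and b (m, d)."""
    d2 = (a ** 2).sum(1)[:, None] + (b ** 2).sum(1)[None, :] - 2.0 * a @ b.T
    return np.maximum(d2, 0.0)  # clip tiny negatives caused by round-off


def fit_diffusion_map(landmarks, eps, n_components, t=1):
    """Fit a diffusion map on one class's landmark features (hypothetical helper)."""
    K = np.exp(-pairwise_sq_dists(landmarks, landmarks) / eps)
    d = K.sum(axis=1)
    # Symmetric normalization D^{-1/2} K D^{-1/2} shares eigenvalues with P = D^{-1} K.
    S = K / np.sqrt(np.outer(d, d))
    evals, evecs = np.linalg.eigh(S)
    order = np.argsort(evals)[::-1]
    evals, evecs = evals[order], evecs[:, order]
    psi = evecs / np.sqrt(d)[:, None]  # right eigenvectors of P
    # Drop the trivial constant eigenvector (eigenvalue 1).
    return {
        "landmarks": landmarks,
        "eps": eps,
        "t": t,
        "lam": evals[1:n_components + 1],
        "psi": psi[:, 1:n_components + 1],
    }


def landmark_embedding(model):
    """Diffusion coordinates lambda_k^t * psi_k(i) of the landmarks themselves."""
    return model["psi"] * model["lam"] ** model["t"]


def nystrom_embed(model, x):
    """Nystrom extension: embed new points x (n, d) using only the landmarks."""
    k = np.exp(-pairwise_sq_dists(x, model["landmarks"]) / model["eps"])
    p = k / k.sum(axis=1, keepdims=True)  # transition probabilities to landmarks
    # psi_k(x) = (p @ psi_k) / lambda_k, then scaled by lambda_k^t.
    return (p @ model["psi"]) * model["lam"] ** (model["t"] - 1)


def diffusion_sq_dist(model, features, prototypes):
    """Squared diffusion distance between features and prototypes."""
    return pairwise_sq_dists(nystrom_embed(model, features),
                             nystrom_embed(model, prototypes))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    landmarks = rng.normal(size=(40, 3))   # stand-in for one class's deep features
    model = fit_diffusion_map(landmarks, eps=2.0, n_components=3, t=2)
    queries = rng.normal(size=(5, 3))      # stand-in for unseen samples
    prototypes = landmarks[:2].copy()      # stand-in for learnable prototypes
    print(diffusion_sq_dist(model, queries, prototypes).shape)
```

In this sketch, a periodic landmark update would amount to calling `fit_diffusion_map` again on features re-extracted from the current backbone. Keeping the landmark set small matters because fitting requires an eigendecomposition of an n-by-n matrix, while each Nyström embedding costs only one kernel evaluation per landmark.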
Authors

Junhao Jia
Hangzhou Dianzi University
Explainable AI (XAI), Interpretable Computer Vision, Medical Image Analysis

Yunyou Liu
Hangzhou Dianzi University, Hangzhou, China

Yifei Sun
Zhejiang University, Hangzhou, China

Huangwei Chen
Zhejiang University, Hangzhou, China

Feiwei Qin
Professor, College of Computer Science, Hangzhou Dianzi University
Artificial Intelligence, Computer-Aided Design, Computer Vision, Medical Image Analysis

Changmiao Wang
Shenzhen Research Institute of Big Data, Shenzhen, China

Yong Peng
Hangzhou Dianzi University, Hangzhou, China