iCD: An Implicit Clustering Distillation Method for Structural Information Mining

📅 2025-09-15
📈 Citations: 0
Influential: 0
🤖 AI Summary
Logit-based knowledge distillation, while computationally efficient, suffers from poor interpretability and reliance on ground-truth labels or intermediate feature alignment. To address these limitations, we propose Implicit Clustering Distillation (iCD), the first method to uncover implicit clustering structure directly from decoupled local logit representations. iCD models semantic correlations among logits via a Gram matrix, enabling structured knowledge transfer without label supervision or explicit feature-level alignment. This approach enhances both the student model’s capacity to capture latent semantic structures and the interpretability of its decision-making process. Extensive experiments across multiple benchmark datasets demonstrate that iCD consistently outperforms state-of-the-art baselines, achieving up to a 5.08% absolute accuracy gain—particularly pronounced in fine-grained classification tasks where semantic granularity is critical.

📝 Abstract
Logit knowledge distillation has gained substantial research interest in recent years due to its simplicity and its lack of requirement for intermediate feature alignment; however, it suffers from limited interpretability in its decision-making process. To address this, we propose Implicit Clustering Distillation (iCD): a simple and effective method that mines and transfers interpretable structural knowledge from logits, without requiring ground-truth labels or feature-space alignment. iCD leverages Gram matrices over decoupled local logit representations to enable student models to learn latent semantic structural patterns. Extensive experiments on benchmark datasets demonstrate the effectiveness of iCD across diverse teacher-student architectures, with particularly strong performance in fine-grained classification tasks, achieving a peak improvement of +5.08% over the baseline. The code is available at: https://github.com/maomaochongaa/iCD.
Problem

Research questions and friction points this paper is trying to address.

Improving interpretability in logit knowledge distillation
Mining structural knowledge without labels or feature alignment
Enhancing student model learning of semantic patterns
Innovation

Methods, ideas, or system contributions that make the work stand out.

Implicit Clustering Distillation for structural mining
Uses Gram matrices on local logits
Transfers semantic patterns without feature alignment
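The core idea above, matching relational structure between teacher and student via Gram matrices computed from logits alone, can be sketched as follows. This is a minimal illustration, not the paper's implementation: the function names are hypothetical, and iCD's decoupling of local logit representations is omitted; only the Gram-matrix matching step is shown.

```python
import numpy as np

def gram_matrix(logits):
    """Pairwise-similarity (Gram) matrix over a batch of logit vectors.

    logits: array of shape (batch, num_classes). Rows are L2-normalized
    first, so the Gram matrix captures relational (clustering) structure
    among samples rather than logit magnitudes.
    """
    z = logits / np.linalg.norm(logits, axis=1, keepdims=True)
    return z @ z.T  # shape (batch, batch)

def icd_style_loss(student_logits, teacher_logits):
    """Mean-squared error between student and teacher Gram matrices.

    A simplified stand-in for a structural distillation objective: it
    needs no ground-truth labels and no intermediate features, only the
    two models' logits on the same batch.
    """
    gs = gram_matrix(student_logits)
    gt = gram_matrix(teacher_logits)
    return float(np.mean((gs - gt) ** 2))
```

Because rows are normalized before the outer product, the loss is invariant to per-sample logit scale and penalizes only mismatches in the pairwise semantic structure the teacher induces over the batch.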
Xiang Xue
Inner Mongolia University of Technology, China
Yatu Ji
Inner Mongolia University of Technology, China
Qing-dao-er-ji Ren
Inner Mongolia University of Technology, China
Bao Shi
Inner Mongolia University of Technology, China
Min Lu
Shenzhen University
Visualization, Visual Analytics
Nier Wu
Inner Mongolia University of Technology, China
Xufei Zhuang
Inner Mongolia University of Technology, China
Haiteng Xu
Inner Mongolia University of Technology, China
Gan-qi-qi-ge Cha
Inner Mongolia Autonomous Region Water Conservancy Development Center, China