Competitive Distillation: A Simple Learning Strategy for Improving Visual Classification

📅 2025-06-29
📈 Citations: 0
Influential: 0
🤖 AI Summary
Existing knowledge distillation methods, such as deep mutual learning and self-distillation, achieve limited performance gains because they ignore how learning directions between networks change across training iterations. To address this, we propose a competitive distillation framework that establishes a multi-network collaborative training architecture. It introduces a dynamic teacher-student role-switching mechanism, in which each network adaptively assumes the teacher or student role based on its real-time performance, coupled with stochastic perturbation that induces parameter mutations and facilitates global optimization. This approach breaks away from conventional unidirectional distillation, enhancing feature discriminability and model generalization. Extensive experiments on major visual classification benchmarks, including CIFAR-100 and ImageNet-1K, demonstrate consistent and significant improvements over state-of-the-art distillation methods, validating both effectiveness and cross-dataset generalizability.

📝 Abstract
Deep Neural Networks (DNNs) have significantly advanced the field of computer vision. To improve the DNN training process, knowledge distillation methods demonstrate their effectiveness in accelerating network training by introducing a fixed learning direction from the teacher network to student networks. In this context, several distillation-based optimization strategies have been proposed, e.g., deep mutual learning and self-distillation, as attempts to achieve generic training performance enhancement through the cooperative training of multiple networks. However, such strategies achieve limited improvements due to a poor understanding of the impact of learning directions among networks across different iterations. In this paper, we propose a novel competitive distillation strategy that allows each network in a group to potentially act as a teacher based on its performance, enhancing overall learning performance. Competitive distillation organizes a group of networks to perform a shared task and engage in competition, where competitive optimization is proposed to improve the parameter-updating process. We further introduce stochastic perturbation into competitive distillation, aiming to motivate networks to induce mutations and thus achieve better visual representations and a global optimum. Experimental results show that competitive distillation achieves promising performance across diverse tasks and datasets.
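
To make the strategy concrete, below is a minimal PyTorch sketch of one competitive-distillation training step. It is an illustration under stated assumptions, not the paper's implementation: it assumes "performance" is tracked as a recent validation score per network, that the best-scoring network acts as teacher for the step, and that distillation uses KL divergence on temperature-softened logits. The names competitive_step, val_scores, T, and alpha are hypothetical.

import torch
import torch.nn.functional as F

def competitive_step(models, optimizers, x, y, val_scores, T=4.0, alpha=0.5):
    # Pick the best-performing network to act as the teacher for this step
    # (assumed criterion: highest recent validation score).
    teacher_idx = max(range(len(models)), key=lambda i: val_scores[i])
    with torch.no_grad():
        teacher_logits = models[teacher_idx](x)

    for i, (model, opt) in enumerate(zip(models, optimizers)):
        logits = model(x)
        loss = F.cross_entropy(logits, y)  # supervised task loss
        if i != teacher_idx:
            # Students additionally imitate the teacher's softened outputs.
            kd = F.kl_div(
                F.log_softmax(logits / T, dim=1),
                F.softmax(teacher_logits / T, dim=1),
                reduction="batchmean",
            ) * (T * T)
            loss = alpha * loss + (1.0 - alpha) * kd
        opt.zero_grad()
        loss.backward()
        opt.step()

Under this reading, role switching falls out naturally: whichever network currently leads on val_scores is imitated by the rest, and the roles can flip at the next evaluation.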
Problem

Research questions and friction points this paper is trying to address.

Improving DNN training via dynamic teacher-student roles
Enhancing learning directions in iterative network training
Boosting visual representation through competitive optimization
Innovation

Methods, ideas, or system contributions that make the work stand out.

Competitive distillation enhances learning via performance-based teacher selection
Stochastic perturbation encourages better visual representations (see the sketch after this list)
Competitive optimization improves parameter updating process
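
As a complement, here is a minimal sketch of the stochastic-perturbation idea, assuming it amounts to occasionally injecting small Gaussian noise into a network's parameters to nudge it out of local optima. The noise scale sigma and trigger probability prob are illustrative values, not taken from the paper.

import torch

def perturb_parameters(model, sigma=1e-3, prob=0.1):
    # With probability `prob`, add N(0, sigma^2) noise to every parameter
    # (assumed form of the paper's "parameter mutation").
    if torch.rand(1).item() < prob:
        with torch.no_grad():
            for p in model.parameters():
                p.add_(torch.randn_like(p) * sigma)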
Daqian Shi
University of Trento, UCL, QMUL
Xiaolei Diao
School of Electronic Engineering and Computer Science, Queen Mary University of London, London, UK
Xu Chen
Digital Environment Research Institute (DERI), Queen Mary University of London, London, UK
Cédric M. John
Professor of Data Science for the Environment and Sustainability, Queen Mary University of London
deep learning, machine learning, carbonates, clumped isotopes, paleoclimate