Long-Tailed Recognition via Information-Preservable Two-Stage Learning

📅 2025-10-09
📈 Citations: 0
Influential: 0
🤖 AI Summary
To address the poor tail-class recognition performance of deep classification models under long-tailed data distributions, this paper proposes a two-stage information-theoretic learning framework. In the first stage, feature representations are optimized via mutual information maximization to reduce intra-class variance and enhance inter-class separability. In the second stage, an information-aware sample selection strategy is introduced to prioritize strengthening discriminative boundaries for low-frequency classes while preserving overall data distribution integrity. By jointly addressing representation learning and sampling bias, the method mitigates classification boundary shift—a key challenge in long-tailed learning. Extensive experiments on standard long-tailed benchmarks—including CIFAR-10-LT and ImageNet-LT—demonstrate state-of-the-art performance: significant gains in tail-class accuracy without compromising overall accuracy, thereby validating the framework’s effectiveness and robustness.

📝 Abstract
Imbalance (or a long tail) is inherent to many real-world data distributions and often biases deep classification models toward frequent classes, resulting in poor performance on tail classes. In this paper, we propose a novel two-stage learning approach that mitigates this majority-biased tendency while preserving valuable information within datasets. Specifically, the first stage introduces a new representation learning technique from the information-theory perspective. This technique is theoretically equivalent to minimizing intra-class distance, yielding an effective and well-separated feature space. The second stage develops a novel sampling strategy that selects mathematically informative instances, which can rectify majority-biased decision boundaries without compromising a model's overall performance. As a result, our approach achieves state-of-the-art performance across various long-tailed benchmark datasets, validated via extensive experiments. Our code is available at https://github.com/fudong03/BNS_IPDPP.
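The two-stage idea described in the abstract can be sketched on toy data. This is a minimal illustration, not the paper's actual algorithm: the centroid-pulling step stands in for the information-theoretic representation learning (its intra-class-distance interpretation), and the margin-based scoring stands in for the informative-instance sampling; all names and parameters here are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy long-tailed dataset: class 0 is a frequent head class,
# class 1 is a rare tail class (10x fewer samples).
X = np.vstack([rng.normal(0.0, 1.0, (100, 2)),   # head class
               rng.normal(3.0, 1.0, (10, 2))])   # tail class
y = np.array([0] * 100 + [1] * 10)

def compact_features(X, y, lr=0.5, steps=10):
    """Stage 1 (sketch): pull each feature toward its class centroid,
    a simple proxy for minimizing intra-class distance."""
    Z = X.copy()
    for _ in range(steps):
        for c in np.unique(y):
            mask = y == c
            centroid = Z[mask].mean(axis=0)
            Z[mask] += lr * (centroid - Z[mask])
    return Z

def select_informative(Z, y, k=20):
    """Stage 2 (sketch): score each instance by how close it lies to
    the midpoint between class centroids, and keep the k smallest-margin
    (most boundary-relevant) instances for rectifying the classifier."""
    c0, c1 = Z[y == 0].mean(axis=0), Z[y == 1].mean(axis=0)
    margin = np.abs(np.linalg.norm(Z - c0, axis=1)
                    - np.linalg.norm(Z - c1, axis=1))
    return np.argsort(margin)[:k]

Z = compact_features(X, y)          # well-separated, compact features
idx = select_informative(Z, y)      # informative subset near the boundary
```

The sketch shows only the control flow shared with the paper's framework: first shape the feature space, then subsample boundary-informative instances, rather than naively oversampling the tail class.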
Problem

Research questions and friction points this paper is trying to address.

Mitigating class imbalance bias in deep learning models
Preserving valuable information in long-tailed datasets
Rectifying majority-biased decision boundaries without performance loss
Innovation

Methods, ideas, or system contributions that make the work stand out.

Two-stage learning approach for long-tailed recognition
Information theory-based representation learning technique
Mathematically informative sampling strategy for decision boundaries
Fudong Lin
Applied Scientist, Amazon
Deep Learning · Computer Vision · Multi-Modal Learning
Xu Yuan
Department of Computer & Information Sciences, University of Delaware, Newark, DE 19711