🤖 AI Summary
In class-imbalanced semi-supervised learning (CISSL), classifier performance on minority classes degrades due to data bias in pseudo-labeling. Method: This paper proposes a bias-mitigation mechanism based on consistency-induced gradient conflict. It introduces (1) a theoretical proof that the black image serves as the optimal baseline for logit subtraction, grounding logit debiasing; (2) a gradient-direction conflict detection module that identifies samples whose pseudo-label gradients oppose those of debiased logits, converting their bias signals into optimization incentives; and (3) a dynamic pseudo-label refinement strategy integrating consistency regularization and baseline calibration for enhanced robustness. Results: Evaluated on multiple CISSL benchmarks, the method significantly improves minority-class accuracy and consistently outperforms existing debiasing approaches.
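The logit-subtraction idea in (1) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names and the toy logit values are hypothetical, and plain Python lists stand in for model outputs. The only idea taken from the summary is that the logits produced for an all-black baseline image are subtracted from a sample's logits before the pseudo-label is read off.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def debias_logits(logits, baseline_logits):
    """Subtract the baseline (black-image) logits class by class."""
    return [z - b for z, b in zip(logits, baseline_logits)]

def pseudo_label(logits):
    """Return the most probable class index and its softmax confidence."""
    probs = softmax(logits)
    label = max(range(len(probs)), key=probs.__getitem__)
    return label, probs[label]

# Hypothetical 3-class example in which class 0 is the majority class.
# Assumption: these are the logits the classifier emits for a black image.
baseline_logits = [2.0, 0.5, 0.0]
# Assumption: logits for one unlabeled sample.
sample_logits = [2.4, 1.6, 0.3]

raw_label, raw_conf = pseudo_label(sample_logits)
debiased = debias_logits(sample_logits, baseline_logits)
debiased_label, debiased_conf = pseudo_label(debiased)
```

In this toy case the raw pseudo-label is the majority class, while the debiased pseudo-label moves to class 1 once the classifier's input-independent preference for class 0 is removed.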
📝 Abstract
Classifiers trained on class-imbalanced datasets often become biased, especially in the semi-supervised learning (SSL) setting. Previous work tries to re-balance the classifier by subtracting the logits of a class-irrelevant image, but this approach lacks a firm theoretical basis. We theoretically analyze why exploiting a baseline image can refine pseudo-labels and prove that the black image is the best choice. We also show that, as training progresses, the pseudo-labels before and after refinement become closer. Based on this observation, we propose a debiasing scheme dubbed LCGC (Learning from Consistency Gradient Conflicting), which encourages biased class predictions during training. We intentionally update the pseudo-labels whose gradients conflict with the debiased logits, representing the optimization direction offered by the over-imbalanced classifier predictions. Then, we debias the predictions by subtracting the baseline image logits during testing. Extensive experiments demonstrate that LCGC can significantly improve the prediction accuracy of existing CISSL models on public benchmarks.
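One way to picture a gradient conflict between raw and debiased pseudo-labels is sketched below. The abstract does not specify how the conflict is measured, so everything here is an assumption made for illustration: the gradient is taken with respect to the logits only (for cross-entropy this is softmax minus the one-hot label), a conflict is defined as a negative dot product between the two gradients, and the function names and logit values are hypothetical.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def argmax(values):
    """Index of the largest value."""
    return max(range(len(values)), key=values.__getitem__)

def ce_grad_wrt_logits(logits, label):
    """Gradient of cross-entropy w.r.t. the logits: softmax(z) - onehot(label)."""
    probs = softmax(logits)
    return [p - (1.0 if i == label else 0.0) for i, p in enumerate(probs)]

def gradients_conflict(logits, baseline_logits):
    """Assumed criterion: the gradients induced by the raw and the
    debiased pseudo-labels have a negative dot product."""
    debiased = [z - b for z, b in zip(logits, baseline_logits)]
    g_raw = ce_grad_wrt_logits(logits, argmax(logits))
    g_deb = ce_grad_wrt_logits(logits, argmax(debiased))
    return sum(a * b for a, b in zip(g_raw, g_deb)) < 0.0

# Hypothetical black-image logits and a batch of two unlabeled samples.
baseline_logits = [2.0, 0.5, 0.0]
batch = [[2.4, 1.6, 0.3], [3.5, 0.2, 0.1]]
conflict_mask = [gradients_conflict(z, baseline_logits) for z in batch]
```

Under this simplification a conflict can only arise when the raw and debiased pseudo-labels differ, since identical labels yield identical gradients with a non-negative dot product. How LCGC actually uses the conflicting samples during training is described in the paper itself and is not reproduced here.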