Beyond Rebalancing: Benchmarking Binary Classifiers Under Class Imbalance Without Rebalancing Techniques

📅 2025-09-09

🤖 AI Summary

Problem: In critical domains such as medical diagnosis, standard evaluation of binary classifiers under severe class imbalance often fails to reflect real-world robustness, especially when rebalancing techniques cannot be applied. Method: We propose a rebalancing-free robustness evaluation framework that synthesizes complex decision boundaries and uses one-shot and few-shot minority-class settings to emulate realistic extreme imbalance. We systematically benchmark TabPFN, boosting-based ensembles, one-class classification (OCC), and undersampling and oversampling strategies across real-world and synthetic datasets. Results: Traditional classifiers degrade markedly as the minority class shrinks and data complexity increases, whereas TabPFN and boosting-based ensembles retain comparatively higher performance and generalize better. The study compares the robustness of these models under unrebalanced conditions within a single evaluation framework and offers guidance on model selection for imbalanced learning.

📝 Abstract

Class imbalance poses a significant challenge to supervised classification, particularly in critical domains like medical diagnostics and anomaly detection where minority class instances are rare. While numerous studies have explored rebalancing techniques to address this issue, less attention has been given to evaluating the performance of binary classifiers under imbalance when no such techniques are applied. Therefore, the goal of this study is to assess the performance of binary classifiers "as-is", without performing any explicit rebalancing. Specifically, we systematically evaluate the robustness of a diverse set of binary classifiers across both real-world and synthetic datasets, under progressively reduced minority class sizes, using one-shot and few-shot scenarios as baselines. Our approach also explores varying data complexities through synthetic decision boundary generation to simulate real-world conditions. In addition to standard classifiers, we include experiments using undersampling and oversampling strategies and one-class classification (OCC) methods to examine their behavior under severe imbalance. The results confirm that classification becomes more difficult as data complexity increases and the minority class size decreases. While traditional classifiers deteriorate under extreme imbalance, advanced models like TabPFN and boosting-based ensembles retain relatively higher performance and generalize better. Visual interpretability analyses and evaluation metrics further support these findings. Our work offers valuable guidance on model selection for imbalanced learning, providing insights into classifier robustness without dependence on explicit rebalancing techniques.
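
The "progressively reduced minority class sizes" protocol described in the abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the function name `reduce_minority`, the toy data, and the schedule `[8, 4, 2, 1]` are hypothetical and are not taken from the paper. The sketch keeps every majority row and only a fixed number of randomly chosen minority rows, ending in the one-shot case; no rebalancing is applied afterwards.

```python
import random


def reduce_minority(X, y, n_minority, minority_label=1, seed=0):
    """Keep all majority rows and only n_minority randomly chosen minority rows.

    Hypothetical helper for illustration; not the paper's implementation.
    """
    rng = random.Random(seed)
    minority_idx = [i for i, label in enumerate(y) if label == minority_label]
    majority_idx = [i for i, label in enumerate(y) if label != minority_label]
    if not 1 <= n_minority <= len(minority_idx):
        raise ValueError("n_minority must be between 1 and the number of minority rows")
    # Preserve the original row order so the subset stays easy to inspect.
    kept = sorted(majority_idx + rng.sample(minority_idx, n_minority))
    return [X[i] for i in kept], [y[i] for i in kept]


# Toy data (assumed): 20 majority rows (label 0) and 8 minority rows (label 1).
X = [[float(i)] for i in range(28)]
y = [0] * 20 + [1] * 8

# Assumed schedule: few-shot settings down to the one-shot case.
for n in [8, 4, 2, 1]:
    X_sub, y_sub = reduce_minority(X, y, n)
    prevalence = sum(y_sub) / len(y_sub)
    print(f"minority rows={n}, total rows={len(y_sub)}, prevalence={prevalence:.3f}")
```

In a benchmark of this kind, the reduced set would typically serve as training data while a held-out test split stays fixed across settings, so that scores are comparable; the abstract does not spell out this detail, so treat it as an assumption.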

Problem

Research questions and friction points this paper is trying to address.

Evaluating binary classifiers without rebalancing under class imbalance

Assessing classifier robustness with reduced minority class sizes

Exploring performance across varying data complexities and imbalance scenarios

Innovation

Methods, ideas, or system contributions that make the work stand out.

Evaluating classifiers without rebalancing techniques

Testing robustness under reduced minority class sizes

Using synthetic decision boundaries for data complexity
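
One way to picture "synthetic decision boundaries for data complexity" is a generator with a single complexity knob. The sketch below is an assumption, not the paper's generator: it labels 2-D points by whether they lie above a sinusoidal boundary, where the hypothetical `frequency` parameter controls how wiggly (and therefore how hard to learn) the boundary is, and it draws exactly the requested number of majority and minority points.

```python
import math
import random


def make_boundary_dataset(n_majority, n_minority, frequency, seed=0):
    """Sample points in the unit square, labeled by a sinusoidal boundary.

    Label 1 (minority) lies above the curve, label 0 (majority) below.
    Illustrative only; the boundary shape is an assumption.
    """
    rng = random.Random(seed)
    targets = {0: n_majority, 1: n_minority}
    counts = {0: 0, 1: 0}
    X, y = [], []
    # Rejection sampling: draw uniformly, keep a point only if its class
    # still needs more samples.
    while counts[0] < targets[0] or counts[1] < targets[1]:
        x1, x2 = rng.random(), rng.random()
        boundary = 0.5 + 0.25 * math.sin(2 * math.pi * frequency * x1)
        label = 1 if x2 > boundary else 0
        if counts[label] < targets[label]:
            X.append((x1, x2))
            y.append(label)
            counts[label] += 1
    return X, y


# Higher frequency means a more complex boundary; 3 minority points is a few-shot setting.
for freq in [1, 4, 8]:
    X, y = make_boundary_dataset(n_majority=200, n_minority=3, frequency=freq)
    print(f"frequency={freq}, rows={len(X)}, minority rows={sum(y)}")
```

Because the class counts are fixed independently of the boundary shape, imbalance and complexity can be varied separately, which matches the two axes the summary describes.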
