Boundary-Aware Adversarial Filtering for Reliable Diagnosis under Extreme Class Imbalance

📅 2025-11-18
🤖 AI Summary
This paper targets classification under extreme class imbalance where recall and calibration both matter, as in medical diagnosis. It proposes AF-SMOTE, a data augmentation framework that first synthesizes minority-class samples and then filters them with an adversarial discriminator and a boundary utility model. Under mild assumptions on decision boundary smoothness and class-conditional densities, the authors prove that the filtering step monotonically improves a surrogate of F_β (for β ≥ 1) without inflating the Brier score. On MIMIC-IV proxy label prediction and fraud detection benchmarks, AF-SMOTE attains higher recall and average precision than oversampling baselines (SMOTE, ADASYN, Borderline-SMOTE, SVM-SMOTE) and the best calibration (ECE and Brier score). The gains are further validated on additional datasets, and the proxy-label healthcare experiment is presented as disease-agnostic evidence of clinical applicability.

📝 Abstract
We study classification under extreme class imbalance where recall and calibration are both critical, for example in medical diagnosis scenarios. We propose AF-SMOTE, a mathematically motivated augmentation framework that first synthesizes minority points and then filters them by an adversarial discriminator and a boundary utility model. We prove that, under mild assumptions on the decision boundary smoothness and class-conditional densities, our filtering step monotonically improves a surrogate of F_beta (for beta >= 1) while not inflating Brier score. On MIMIC-IV proxy label prediction and canonical fraud detection benchmarks, AF-SMOTE attains higher recall and average precision than strong oversampling baselines (SMOTE, ADASYN, Borderline-SMOTE, SVM-SMOTE), and yields the best calibration. We further validate these gains across multiple additional datasets beyond MIMIC-IV. Applying AF-SMOTE to a healthcare dataset with a proxy label shows, in a disease-agnostic way, its practical value in clinical settings, where missing true positive cases of rare diseases can have severe consequences.
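For readers unfamiliar with the two quantities named in the abstract, the following is a minimal sketch of the standard F_beta and Brier score definitions in plain Python. It shows the textbook metrics only; the paper's surrogate of F_beta is not specified on this page and is not reproduced here. The function names `f_beta` and `brier_score` are illustrative choices, not names from the paper.

```python
def f_beta(y_true, y_pred, beta=2.0):
    """Standard F_beta from hard 0/1 predictions.

    With beta >= 1, recall is weighted at least as heavily as precision.
    """
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    b2 = beta ** 2
    denom = (1 + b2) * tp + b2 * fn + fp
    # Convention assumed here: return 0.0 when there are no positives at all.
    return (1 + b2) * tp / denom if denom else 0.0


def brier_score(y_true, y_prob):
    """Mean squared difference between predicted probabilities and 0/1 labels.

    Lower is better; it does not increase when calibration is preserved.
    """
    return sum((p - t) ** 2 for t, p in zip(y_true, y_prob)) / len(y_true)
```

Because false negatives enter the denominator with weight beta squared, a missed positive costs more than a false alarm whenever beta > 1, which is why F_beta with beta >= 1 suits rare-case detection.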
Problem

Research questions and friction points this paper is trying to address.

Addressing classification under extreme class imbalance with recall and calibration requirements
Proposing adversarial filtering to improve F_beta surrogate without inflating Brier score
Validating framework on medical diagnosis and fraud detection with improved performance
Innovation

Methods, ideas, or system contributions that make the work stand out.

Synthesizes minority-class points as the first stage of the augmentation framework
Filters the synthetic points with an adversarial discriminator
Filters the synthetic points with a boundary utility model as well
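The synthesize-then-filter idea can be sketched as follows. This is a rough illustration under stated assumptions, not the authors' implementation. The interpolation step is a SMOTE-style stand-in. The `realism` and `utility` scores are hypothetical distance-based placeholders for the adversarial discriminator and the boundary utility model, whose actual forms are not given on this page. The rule of ranking by the product of the two scores and keeping a fixed fraction is likewise an assumption.

```python
import math
import random


def smote_like(minority, k=3, n_new=10, rng=None):
    """SMOTE-style interpolation between a minority point and a near neighbour."""
    rng = rng or random.Random(0)
    synthetic = []
    for _ in range(n_new):
        x = rng.choice(minority)
        # k nearest other minority points (requires at least two minority points)
        neighbours = sorted((p for p in minority if p is not x),
                            key=lambda p: math.dist(x, p))[:k]
        nb = rng.choice(neighbours)
        lam = rng.random()
        synthetic.append(tuple(a + lam * (b - a) for a, b in zip(x, nb)))
    return synthetic


def nearest_dist(x, points):
    """Euclidean distance from x to its closest point in `points`."""
    return min(math.dist(x, p) for p in points)


def make_scores(minority, majority):
    """Hypothetical placeholder scores, NOT the paper's models."""
    def realism(s):
        # Stand-in for a discriminator: high when close to real minority data.
        return math.exp(-nearest_dist(s, minority))

    def utility(s):
        # Stand-in for boundary utility: high when roughly equidistant
        # from the two classes, i.e. near a plausible decision boundary.
        gap = abs(nearest_dist(s, minority) - nearest_dist(s, majority))
        return math.exp(-gap)

    return realism, utility


def filter_synthetic(synthetic, realism, utility, keep_frac=0.5):
    """Keep the top fraction of synthetic points ranked by realism * utility."""
    ranked = sorted(synthetic, key=lambda s: realism(s) * utility(s), reverse=True)
    n_keep = max(1, int(len(ranked) * keep_frac))
    return ranked[:n_keep]
```

A usage example on toy 2-D data:

```python
minority = [(0, 0), (1, 0), (0, 1), (1, 1)]
majority = [(3, 3), (4, 3), (3, 4)]
synthetic = smote_like(minority, k=2, n_new=8, rng=random.Random(1))
realism, utility = make_scores(minority, majority)
kept = filter_synthetic(synthetic, realism, utility, keep_frac=0.5)
```

Only the retained points in `kept` would be added to the training set. In a real setting the two placeholder scores would be replaced by trained models.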
Yanxuan Yu
Columbia University, USA; Columbia University Irving Medical Center, USA
Michael S. Hughes
Columbia University, USA; Columbia University Irving Medical Center, USA
Julien Lee
Columbia University, USA; Columbia University Irving Medical Center, USA
Jiacheng Zhou
Columbia University, USA; Columbia University Irving Medical Center, USA
Andrew F. Laine
Columbia University
biomedical imaging, image analysis, deep learning, biomedical informatics, machine learning