🤖 AI Summary
In binary imbalanced classification, accuracy is misleading, and existing methods struggle to simultaneously accommodate class importance disparities and satisfy specific metric constraints. This paper directly optimizes precision and recall, addressing three practical scenarios: maximizing recall under a fixed-precision constraint (FPOR), maximizing precision under a fixed-recall constraint (FROP), and optimizing the Fβ-score (OFBS). We propose, for the first time, a unified framework based on exact constrained reformulation, circumventing smooth approximations of the nonsmooth metric functions. Our approach rewrites the constraints and applies exact penalty methods to enable differentiable optimization. The framework is theoretically rigorous and highly extensible, and it preserves the original metric semantics without approximation bias. Extensive experiments on multiple benchmark datasets demonstrate that our method consistently outperforms state-of-the-art approaches across all three tasks, validating its effectiveness and practical utility.
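As a concrete reference for the three objectives, here is a minimal sketch of the metrics involved. The function name, the example labels, and the 0.8 constraint levels in the comments are illustrative choices, not taken from the paper:

```python
import numpy as np

def precision_recall_fbeta(y_true, y_pred, beta=1.0):
    """Compute precision, recall, and F-beta from binary labels and predictions."""
    tp = np.sum((y_true == 1) & (y_pred == 1))  # true positives
    fp = np.sum((y_true == 0) & (y_pred == 1))  # false positives
    fn = np.sum((y_true == 1) & (y_pred == 0))  # false negatives
    precision = tp / (tp + fp) if tp + fp > 0 else 0.0
    recall = tp / (tp + fn) if tp + fn > 0 else 0.0
    b2 = beta ** 2
    denom = b2 * precision + recall
    fbeta = (1 + b2) * precision * recall / denom if denom > 0 else 0.0
    return precision, recall, fbeta

# The three problems, stated over a classifier's operating points:
# FPOR: maximize recall subject to precision >= 0.8 (say)
# FROP: maximize precision subject to recall >= 0.8 (say)
# OFBS: maximize the F-beta score directly
```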
📝 Abstract
For classification with imbalanced class frequencies, i.e., imbalanced classification (IC), standard accuracy is known to be misleading as a performance measure. While most existing methods for IC resort to optimizing balanced accuracy (i.e., the average of class-wise recalls), they fall short in scenarios where the significance of classes varies or certain metrics should reach prescribed levels. In this paper, we study two key classification metrics, precision and recall, under three practical binary IC settings: fixing precision while optimizing recall (FPOR), fixing recall while optimizing precision (FROP), and optimizing the $F_\beta$-score (OFBS). Unlike existing methods that rely on smooth approximations to deal with the indicator function involved, *we introduce, for the first time, exact constrained reformulations for these direct metric optimization (DMO) problems*, which can be effectively solved by exact penalty methods. Experimental results on multiple benchmark datasets demonstrate the practical superiority of our approach over the state-of-the-art methods for the three DMO problems. We also expect our exact reformulation and optimization (ERO) framework to be applicable to a wide range of DMO problems for binary IC and beyond. Our code is available at https://github.com/sun-umn/DMO.
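To make the exact-penalty idea concrete, here is a generic sketch of how a constrained DMO problem such as FPOR (maximize recall subject to precision ≥ α) becomes an unconstrained one via a standard exact ℓ1 penalty. This is a textbook illustration under assumed names and toy numbers, not the paper's specific reformulation:

```python
def penalized_objective(recall, precision, alpha=0.8, rho=10.0):
    """Exact l1 penalty for FPOR: reward recall, penalize precision shortfall.

    A standard exact-penalty fact: once rho exceeds the relevant Lagrange
    multiplier, maximizing this unconstrained objective recovers the solution
    of "max recall s.t. precision >= alpha" exactly, with no smoothing bias.
    """
    return recall - rho * max(0.0, alpha - precision)

# Toy (precision, recall) operating points for one classifier
# (illustrative numbers only).
points = [(0.95, 0.40), (0.85, 0.60), (0.80, 0.70), (0.70, 0.85)]

# The penalized objective selects the feasible point with the highest recall:
# (0.70, 0.85) violates precision >= 0.8 and is penalized away.
best = max(points, key=lambda pr: penalized_objective(pr[1], pr[0]))
```

Swapping the roles of precision and recall gives the analogous FROP penalty; OFBS needs no constraint and is optimized directly.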