MBCT: Tree-Based Feature-Aware Binning for Individual Uncertainty Calibration

📅 2022-02-09
🏛️ The Web Conference
📈 Citations: 13
Influential: 1
🤖 AI Summary
Existing binning-based calibration methods rely solely on model predictions while ignoring input features, and lack personalization capabilities—thus failing to jointly optimize calibration accuracy and ranking preservation. This paper proposes MBCT, a feature-aware personalized binning framework: it introduces a tree-structured, learnable binning mechanism that explicitly models feature interactions to optimize bin boundaries; incorporates sample-level linear regression at each bin node for individualized calibration; and designs a multi-objective loss function jointly minimizing expected calibration error (ECE) and preserving ranking performance (AUC). Extensive experiments across medical, meteorological, and computational advertising datasets demonstrate that MBCT significantly outperforms state-of-the-art calibration methods. Deployed in a large-scale online advertising platform, MBCT has been validated via A/B testing to substantially improve key business metrics—including click-through rate (CTR) and conversion rate (CVR).
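The core idea in the summary above — learn bins from features (not just the raw score), then fit a per-bin linear map so samples inside one bin still receive distinct calibrated values — can be sketched with plain numpy. This is an illustrative assumption of the mechanism, not the paper's exact algorithm: a depth-1 feature split stands in for the learned tree, and score quantiles stand in for the optimized bin boundaries.

```python
# Minimal numpy-only sketch of feature-aware binning with per-bin linear
# calibration (illustrative; NOT the paper's exact MBCT algorithm).
import numpy as np

rng = np.random.default_rng(0)
n = 2000
x = rng.normal(size=n)                        # one input feature
true_prob = 1.0 / (1.0 + np.exp(-x))
y = (rng.random(n) < true_prob).astype(float)
p = np.clip(true_prob ** 2, 1e-6, 1 - 1e-6)   # miscalibrated raw scores

# Feature-aware binning: split on the feature (a depth-1 "tree"),
# then quantile-bin the raw score within each branch -> 8 bins total.
bins = np.zeros(n, dtype=int)
for side, branch in enumerate([x < np.median(x), x >= np.median(x)]):
    edges = np.quantile(p[branch], [0.0, 0.25, 0.5, 0.75, 1.0])
    idx = np.clip(np.searchsorted(edges, p[branch], side="right") - 1, 0, 3)
    bins[branch] = side * 4 + idx

# Individual calibration: a per-bin linear map from raw score to label,
# so two samples in the same bin keep distinct (order-aware) values.
calibrated = np.empty(n)
for b in range(8):
    m = bins == b
    slope, intercept = np.polyfit(p[m], y[m], 1)
    calibrated[m] = np.clip(slope * p[m] + intercept, 0.0, 1.0)
```

On this synthetic data the per-bin linear fits pull the mean prediction back toward the empirical positive rate, while avoiding the tied values a plain bin-average would produce.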
📝 Abstract
Most machine learning classifiers are concerned only with classification accuracy, while certain applications (such as medical diagnosis, meteorological forecasting, and computational advertising) require the model to predict the true probability, known as a calibrated estimate. In previous work, researchers have developed several calibration methods to post-process the outputs of a predictor to obtain calibrated values, such as binning and scaling methods. Compared with scaling, binning methods are shown to have distribution-free theoretical guarantees, which motivates us to prefer binning methods for calibration. However, we notice that existing binning methods have several drawbacks: (a) the binning scheme only considers the original prediction values, thus limiting the calibration performance; and (b) the binning approach is non-individual, mapping multiple samples in a bin to the same value, and thus is not suitable for order-sensitive applications. In this paper, we propose a feature-aware binning framework, called Multiple Boosting Calibration Trees (MBCT), along with a multi-view calibration loss to tackle the above issues. Our MBCT optimizes the binning scheme by the tree structures of features, and adopts a linear function in a tree node to achieve individual calibration. Our MBCT is non-monotonic, and has the potential to improve order accuracy, due to its learnable binning scheme and the individual calibration. We conduct comprehensive experiments on three datasets in different fields. Results show that our method outperforms all competing models in terms of both calibration error and order accuracy. We also conduct simulation experiments, justifying that the proposed multi-view calibration loss is a better metric in modeling calibration error. In addition, our approach is deployed in a real-world online advertising platform; an A/B test over two weeks further demonstrates the effectiveness and great business value of our approach.
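For context on the baseline the abstract contrasts against, here is a sketch of classic histogram binning (which maps every sample in a bin to the same value — the "non-individual" drawback) and the expected calibration error (ECE) used to evaluate it. Function names and the equal-width/quantile choices are illustrative, not taken from the paper.

```python
# Classic (non-individual) histogram binning and an ECE metric; a minimal
# sketch of the baseline methods, not the paper's proposed approach.
import numpy as np

def ece(probs, labels, n_bins=10):
    """Expected calibration error with equal-width bins:
    weighted average of |mean prediction - empirical positive rate|."""
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    idx = np.clip(np.digitize(probs, edges) - 1, 0, n_bins - 1)
    err = 0.0
    for b in range(n_bins):
        m = idx == b
        if m.any():
            err += m.mean() * abs(probs[m].mean() - labels[m].mean())
    return err

def histogram_binning(train_probs, train_labels, test_probs, n_bins=10):
    """Map each test score to the empirical positive rate of its
    (quantile) bin -- all samples in a bin get the SAME value."""
    edges = np.quantile(train_probs, np.linspace(0.0, 1.0, n_bins + 1))
    tr = np.clip(np.searchsorted(edges, train_probs, side="right") - 1, 0, n_bins - 1)
    te = np.clip(np.searchsorted(edges, test_probs, side="right") - 1, 0, n_bins - 1)
    bin_means = np.array([train_labels[tr == b].mean() if (tr == b).any() else 0.5
                          for b in range(n_bins)])
    return bin_means[te]
```

The ties produced by `bin_means[te]` are exactly what makes plain binning unsuitable for order-sensitive applications such as ad ranking.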
Problem

Research questions and friction points this paper is trying to address.

Existing binning schemes derive bins from raw prediction values only, ignoring input features
Bin-level mapping assigns the same calibrated value to every sample in a bin, which hurts order-sensitive applications
Calibration error and ranking (order) accuracy are not jointly optimized by existing methods
Innovation

Methods, ideas, or system contributions that make the work stand out.

Learnable, feature-aware binning scheme optimized via tree structures
Individual calibration via a linear function in each tree node, avoiding ties within a bin
Multi-view calibration loss that better models calibration error while preserving order accuracy
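The last point — a loss that trades off calibration error against ranking quality — could be sketched as a convex combination of a binned calibration gap and a pairwise ranking term (1 − AUC). The weighting scheme and exact terms below are assumptions for illustration; the paper's multi-view loss is not reproduced here.

```python
# Hypothetical multi-objective calibration/ranking loss (illustrative
# assumption, not the paper's multi-view loss formulation).
import numpy as np

def multi_view_loss(probs, labels, alpha=0.5, n_bins=10):
    """alpha * (binned calibration gap) + (1 - alpha) * (1 - AUC)."""
    # Calibration term: weighted |mean prediction - positive rate| per bin.
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    idx = np.clip(np.digitize(probs, edges) - 1, 0, n_bins - 1)
    cal = sum((idx == b).mean() * abs(probs[idx == b].mean() - labels[idx == b].mean())
              for b in range(n_bins) if (idx == b).any())
    # Ranking term: fraction of correctly ordered pos/neg pairs (ties ignored).
    pos, neg = probs[labels == 1], probs[labels == 0]
    auc = (pos[:, None] > neg[None, :]).mean()
    return alpha * cal + (1.0 - alpha) * (1.0 - auc)
```

Because both terms lie in [0, 1], the combined loss does too, and `alpha` controls how strongly calibration accuracy is weighted against ranking preservation.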
Authors

Siguang Huang
Alibaba Group, China
Yunli Wang
National Research Council Canada
Natural Language Processing · Machine Learning · Text Mining · Bioinformatics
Lili Mou
University of Alberta
Natural Language Processing · Machine Learning
Huayue Zhang
Alibaba Group, China
Han Zhu
Alibaba Group, China
Chuan Yu
Alibaba Group, China
Bo Zheng
Alibaba Group, China