Cost-Sensitive Unbiased Risk Estimation for Multi-Class Positive-Unlabeled Learning

📅 2025-10-29

🤖 AI Summary

Problem: In multi-positive and unlabeled (MPU) learning, the absence of reliable negative examples leads to biased risk estimation. Method: This paper proposes an adaptive cost-sensitive approach that formalizes the MPU data-generating mechanism and designs a data-dependent, dynamic loss weighting scheme, which it presents as the first to yield an unbiased empirical estimate of the target risk. Within an empirical risk minimization framework, it jointly optimizes the positive-class loss and the inferred negative-class loss (the latter derived from unlabeled data), and it derives a generalization error bound for the resulting estimator. Contribution/Results: Experiments on eight public benchmarks show that the method consistently outperforms strong baselines across varying class priors and numbers of classes, with an average accuracy improvement of 2.1%. Training is also reported to be more stable.

📝 Abstract

Positive-Unlabeled (PU) learning considers settings in which only positive and unlabeled data are available, while negatives are missing or left unlabeled. This situation is common in real applications where annotating reliable negatives is difficult or costly. Despite substantial progress in PU learning, the multi-class case (MPU) remains challenging: many existing approaches do not ensure unbiased risk estimation, which limits performance and stability. We propose a cost-sensitive multi-class PU method based on adaptive loss weighting. Within the empirical risk minimization framework, we assign distinct, data-dependent weights to the positive and inferred-negative (from the unlabeled mixture) loss components so that the resulting empirical objective is an unbiased estimator of the target risk. We formalize the MPU data-generating process and establish a generalization error bound for the proposed estimator. Extensive experiments on eight public datasets, spanning varying class priors and numbers of classes, show consistent gains over strong baselines in both accuracy and stability.
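
To make the idea of an unbiased objective built from positive and inferred-negative loss components concrete, here is a minimal NumPy sketch. It is not the paper's estimator: the abstract does not specify the adaptive, data-dependent weights, so this sketch swaps in the classical rewrite with fixed, known class priors. The assumptions are that there are K classes, the first K-1 are observed positive classes with known priors, the last index is the unobserved negative class, and the unlabeled set is drawn from the full mixture. The negative-class loss is then recovered as the unlabeled loss minus the prior-weighted positive losses, all evaluated against the negative label. The function names `cross_entropy` and `mpu_unbiased_risk` are hypothetical.

```python
import numpy as np


def cross_entropy(logits, label):
    """Per-sample cross-entropy of logits (n, K) against one fixed class index."""
    shifted = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return -log_probs[:, label]


def mpu_unbiased_risk(pos_logits, unl_logits, priors):
    """Classical unbiased MPU risk estimate (illustrative, fixed-prior weights).

    pos_logits: list of K-1 arrays; pos_logits[k] has shape (n_k, K) and holds
                model outputs for samples labeled with positive class k.
    unl_logits: array of shape (n_u, K) with model outputs on unlabeled samples.
    priors:     K-1 class priors for the positive classes (assumed known here).
    """
    neg = unl_logits.shape[1] - 1  # assumption: last index is the negative class
    # Unlabeled data treated as negative: contains every class, so it is biased.
    risk = cross_entropy(unl_logits, neg).mean()
    for k, (logits_k, pi_k) in enumerate(zip(pos_logits, priors)):
        # Ordinary supervised term for positive class k.
        risk += pi_k * cross_entropy(logits_k, k).mean()
        # Correction: remove class k's contribution to the unlabeled term.
        risk -= pi_k * cross_entropy(logits_k, neg).mean()
    return risk


# Small demonstration with random logits for K = 3 classes.
rng = np.random.default_rng(0)
demo_pos = [rng.normal(size=(5, 3)), rng.normal(size=(5, 3))]
demo_unl = rng.normal(size=(20, 3))
demo_risk = mpu_unbiased_risk(demo_pos, demo_unl, priors=[0.3, 0.2])
print(f"estimated risk: {demo_risk:.4f}")
```

Because of the subtracted correction terms, an estimate of this form can become negative on finite samples, which is one known source of training instability in PU objectives. The sketch leaves the weights fixed at the class priors and does not attempt to reproduce the adaptive weighting the abstract describes.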

Problem

Research questions and friction points this paper is trying to address.

Addresses multi-class positive-unlabeled learning with missing negative labels

Ensures unbiased risk estimation through adaptive loss weighting

Improves accuracy and stability across diverse datasets and class priors

Innovation

Methods, ideas, or system contributions that make the work stand out.

Cost-sensitive unbiased risk estimation for multi-class PU learning

Adaptive loss weighting for positive and inferred-negative components

Empirical risk minimization with data-dependent loss weights
