🤖 AI Summary
In multi-positive–unlabeled (MPU) learning, the absence of reliable negative examples leads to biased risk estimation. Method: This paper proposes an adaptive cost-sensitive approach that formalizes the MPU data generation mechanism and designs a data-dependent dynamic loss weighting scheme, enabling the first unbiased empirical estimation of the target risk. It theoretically derives a generalization error bound to ensure model robustness and jointly optimizes the positive-class loss and the inferred negative-class loss—derived from unlabeled data—within an empirical risk minimization framework. Contribution/Results: Extensive experiments on eight public benchmarks demonstrate that the method consistently outperforms strong baselines across varying class priors and numbers of classes, achieving an average accuracy improvement of 2.1%. Moreover, training exhibits enhanced stability, significantly improving the practicality and reliability of MPU learning in real-world applications.
📝 Abstract
Positive--Unlabeled (PU) learning considers settings in which only positive and unlabeled data are available, while negatives are missing or left unlabeled. This situation is common in real applications where annotating reliable negatives is difficult or costly. Despite substantial progress in PU learning, the multi-class case (MPU) remains challenging: many existing approaches do not ensure emph{unbiased risk estimation}, which limits performance and stability. We propose a cost-sensitive multi-class PU method based on emph{adaptive loss weighting}. Within the empirical risk minimization framework, we assign distinct, data-dependent weights to the positive and emph{inferred-negative} (from the unlabeled mixture) loss components so that the resulting empirical objective is an unbiased estimator of the target risk. We formalize the MPU data-generating process and establish a generalization error bound for the proposed estimator. Extensive experiments on extbf{eight} public datasets, spanning varying class priors and numbers of classes, show consistent gains over strong baselines in both accuracy and stability.