🤖 AI Summary
This work addresses active positive-unlabeled (PU) learning, a weakly supervised setting in which only a subset of the positive instances are labeled and the rest of the data are unlabeled. The learner can query unlabeled samples but receives label feedback only if the queried instance is truly positive and an independent random trial succeeds, i.e., under a probabilistic labeling mechanism. For this setting, the paper presents the first theoretical analysis of label complexity in active PU learning, establishing both upper and lower bounds and thereby filling a critical gap in the theoretical understanding of this paradigm. By integrating an active querying strategy with a PU learning model, the proposed approach significantly improves query efficiency, offering both theoretical guarantees and practical algorithmic guidance for real-world applications such as online advertising and anomaly detection.
📝 Abstract
Learning from positive and unlabeled data (PU learning) is a weakly supervised variant of binary classification in which the learner receives labels for only some of the positive instances, while all other examples remain unlabeled. Motivated by applications such as advertising and anomaly detection, we study an active PU learning setting where the learner can adaptively query instances from an unlabeled pool, but a queried label is revealed only when the instance is positive and an independent coin flip succeeds; otherwise the learner receives no information. In this paper, we provide the first theoretical analysis of the label complexity of active PU learning.
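The one-sided feedback model described above can be sketched in a few lines; this is a minimal simulation, not the paper's algorithm, and the name `label_frequency` for the coin-flip success probability is an assumption (the paper's notation is not given here):

```python
import random

def query_oracle(is_positive: bool, label_frequency: float, rng: random.Random) -> bool:
    """Simulate one query under the PU feedback model from the abstract.

    The true label is revealed only if the queried instance is positive
    AND an independent coin flip (success probability `label_frequency`,
    a hypothetical name) succeeds. Returns True iff a positive label is
    observed; False is uninformative: the instance may be negative, or a
    positive whose coin flip failed.
    """
    return is_positive and (rng.random() < label_frequency)

# Toy illustration: query a pool of hidden labels and count observed positives.
rng = random.Random(42)
hidden_labels = [True, False, True, True, False]  # hypothetical ground truth
observed = [query_oracle(y, label_frequency=0.5, rng=rng) for y in hidden_labels]
```

Note that a `False` response never certifies a negative label, which is exactly why bounding the number of queries (the label complexity) is nontrivial in this setting.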