Classify and Generate Reciprocally: Simultaneous Positive-Unlabelled Learning and Conditional Generation with Extra Data

📅 2020-06-14
🏛️ arXiv.org
📈 Citations: 0
Influential: 0

🤖 AI Summary
To address the underuse of unlabeled data under label scarcity, particularly out-of-distribution (OOD) samples, this paper proposes the Classifier-Noise-Invariant CGAN (CNI-CGAN), presented as the first method to robustly learn the clean data distribution from noisy pseudo-labels predicted by a positive-unlabeled (PU) classifier. CNI-CGAN is part of a jointly optimized framework that couples classification with conditional generation: the PU classifier supplies pseudo-labels for the extra data, and the noise-robust CGAN in turn helps improve the PU classifier. The authors prove the optimal condition of CNI-CGAN and formalize this mutually beneficial training scheme. Experiments on multiple benchmark datasets show simultaneous improvements in PU classification accuracy and conditional generation quality, with reported gains over single-task baselines and over prior PU learning and semi-supervised generative methods.
📝 Abstract
The scarcity of class-labeled data is a ubiquitous bottleneck in a wide range of machine learning problems. While abundant unlabeled data normally exist and provide a potential solution, it is extremely challenging to exploit them. In this paper, we address this problem by leveraging Positive-Unlabeled (PU) classification and conditional generation with extra unlabeled data simultaneously, both of which aim to make full use of agnostic unlabeled data to improve classification and generation performance. In particular, we present a novel training framework that jointly targets both PU classification and conditional generation when exposed to extra data, especially out-of-distribution unlabeled data, by exploring the interplay between them: 1) enhancing the performance of PU classifiers with the assistance of a novel Conditional Generative Adversarial Network (CGAN) that is robust to noisy labels, and 2) leveraging extra data with predicted labels from a PU classifier to help the generation. Our key contribution is a Classifier-Noise-Invariant Conditional GAN (CNI-CGAN) that can learn the clean data distribution from noisy labels predicted by a PU classifier. Theoretically, we prove the optimal condition of CNI-CGAN; experimentally, we conduct extensive evaluations on diverse datasets, verifying simultaneous improvements in both classification and generation.
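For readers new to PU classification, the sketch below shows one standard way to train a classifier from positive and unlabeled samples only: the non-negative PU risk estimator. It is a binary, plain-Python illustration and not necessarily the estimator used in this paper, which the abstract does not specify. The function names, the sigmoid loss, and the assumption that the class prior is known are all choices made for this example.

```python
import math


def sigmoid_loss(score, label):
    """Sigmoid loss l(z, y) = 1 / (1 + exp(y * z)) for label y in {+1, -1}."""
    return 1.0 / (1.0 + math.exp(label * score))


def mean(values):
    return sum(values) / len(values)


def nn_pu_risk(pos_scores, unl_scores, prior):
    """Non-negative PU risk for a binary scorer.

    pos_scores: classifier scores on labeled positive samples
    unl_scores: classifier scores on unlabeled samples
    prior:      class prior P(y = +1), assumed known in this sketch
    """
    risk_pos_as_pos = mean([sigmoid_loss(s, +1) for s in pos_scores])
    risk_pos_as_neg = mean([sigmoid_loss(s, -1) for s in pos_scores])
    risk_unl_as_neg = mean([sigmoid_loss(s, -1) for s in unl_scores])
    # The negative-class risk is estimated as "unlabeled treated as negative"
    # minus the share contributed by positives hidden in the unlabeled set.
    # Clamping at zero stops the estimate from going negative.
    neg_risk = max(0.0, risk_unl_as_neg - prior * risk_pos_as_neg)
    return prior * risk_pos_as_pos + neg_risk


if __name__ == "__main__":
    # Toy scores: positives score high, unlabeled samples are mixed.
    print(nn_pu_risk([2.0, 1.5, 3.0], [2.5, -1.0, -2.0, 0.3], prior=0.4))
```

The clamp is what separates this estimator from the plain unbiased PU risk, which can become negative when a flexible model overfits the positive samples.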
Problem

Research questions and friction points this paper is trying to address.

Addressing scarcity of labeled data using unlabeled data
Simultaneously improving PU classification and conditional generation
Enhancing robustness with noise-invariant GAN and extra data
Innovation

Methods, ideas, or system contributions that make the work stand out.

Simultaneous PU classification and conditional generation
Classifier-Noise-Invariant Conditional GAN
Leveraging extra data with predicted labels
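
The noisy labels that a classifier assigns to extra data can be reasoned about with a confusion matrix. The sketch below is a hypothetical illustration, not the paper's implementation: it assumes a row-stochastic matrix `confusion` with `confusion[i][j]` = P(predicted label j | true label i), and the helper names are made up for this example. A common approach in noise-robust conditional GANs is to pass the generator's conditioning label through the same channel before the discriminator sees it, so both real and generated pairs carry the same kind of label noise. Whether CNI-CGAN does exactly this is not stated in the abstract.

```python
import random


def corrupt_label(true_label, confusion, rng):
    """Sample a noisy label from row `true_label` of a row-stochastic matrix."""
    row = confusion[true_label]
    return rng.choices(range(len(row)), weights=row, k=1)[0]


def noisy_marginal(clean_marginal, confusion):
    """Label distribution after the noise channel: q[j] = sum_i p[i] * C[i][j]."""
    num_classes = len(confusion[0])
    return [
        sum(clean_marginal[i] * confusion[i][j] for i in range(len(clean_marginal)))
        for j in range(num_classes)
    ]


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical two-class channel: class 0 is mislabeled 10% of the time,
    # class 1 is mislabeled 20% of the time.
    confusion = [[0.9, 0.1], [0.2, 0.8]]
    print(noisy_marginal([0.5, 0.5], confusion))
    print([corrupt_label(0, confusion, rng) for _ in range(10)])
```

When the confusion matrix is invertible, the clean label distribution can in principle be recovered from the noisy one, which is the intuition behind learning a clean conditional distribution from noisy labels.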