Generation from Noisy Examples

📅 2025-01-07
📈 Citations: 0
Influential: 0
🤖 AI Summary
This paper investigates whether a generator can reliably produce high-quality, novel positive examples when its stream of positive examples is contaminated with a small number of negatively labeled (noisy) instances, extending the learning-theoretic foundations of generation to the noisy setting. Method: The authors introduce the notion of "generability under noise," establish necessary and sufficient conditions for it over binary hypothesis classes, prove that finite and countable hypothesis classes are intrinsically robust to finitely many noisy examples, and jointly analyze hypothesis-class complexity, constraints on the observed example stream, and noise-tolerance mechanisms to delineate the feasibility boundary of generation under noise. Contribution/Results: The key contribution is uncovering a fundamental trade-off between hypothesis-class size and noise robustness, yielding a rigorous theoretical foundation, discriminative criteria for controllable generation from noisy data, and principled guarantees for generative models operating in realistic, label-contaminated settings.

📝 Abstract
We continue to study the learning-theoretic foundations of generation by extending the results of Kleinberg and Mullainathan [2024] and Li et al. [2024] to account for noisy example streams. In the noiseless setting of Kleinberg and Mullainathan [2024] and Li et al. [2024], an adversary picks a hypothesis from a binary hypothesis class and provides a generator with a sequence of its positive examples. The goal of the generator is to eventually output new, unseen positive examples. In the noisy setting, an adversary still picks a hypothesis and a sequence of its positive examples, but, before presenting the stream to the generator, the adversary inserts a finite number of negative examples. Unaware of which examples are noisy, the generator must still eventually output new, unseen positive examples. In this paper, we provide necessary and sufficient conditions for when a binary hypothesis class is noisily generatable. We provide such conditions with respect to various constraints on the number of distinct examples that must be seen before perfect generation of positive examples. Interestingly, for finite and countable classes we show that generatability is largely unaffected by the presence of a finite number of noisy examples.
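As a toy illustration of the protocol (a sketch, not the paper's construction), consider the countable class of threshold hypotheses H_n = {n, n+1, n+2, ...} over the positive integers. A generator that knows an upper bound on the number of noisy examples can attribute up to that many seen examples to noise, guess the largest threshold consistent with the rest, and emit an unseen example of its guess. The class name, stream, and noise budget below are illustrative assumptions:

```python
def noisy_generator(stream, noise_budget):
    """Toy sketch: generate unseen positives from a noisy stream.

    Hypotheses are H_n = {n, n+1, ...} over positive integers. At most
    `noise_budget` examples in `stream` may be noisy negatives.
    """
    seen = set()
    outputs = []
    for x in stream:
        seen.add(x)
        # Guess the largest threshold n such that at most `noise_budget`
        # seen examples fall below n (those are attributed to noise).
        n = max(k for k in range(1, max(seen) + 2)
                if sum(1 for s in seen if s < k) <= noise_budget)
        # Emit a new example of the guessed hypothesis, avoiding
        # everything already seen or already output.
        y = n
        while y in seen or y in outputs:
            y += 1
        outputs.append(y)
    return outputs

# True hypothesis H_3 = {3, 4, 5, ...}; the adversary inserts the
# noisy negative example 2 into the positive stream [5, 3, 8].
print(noisy_generator([5, 2, 3, 8], noise_budget=1))  # → [6, 7, 4, 9]
```

Because any genuine example is at least the true threshold, the guessed threshold never falls below it once the noise budget covers the inserted negatives, so every emitted example is an unseen positive of the true hypothesis.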
Problem

Research questions and friction points this paper is trying to address.

Noisy datasets
Adversarial samples
High-quality sample generation
Innovation

Methods, ideas, or system contributions that make the work stand out.

Anomaly Robustness
Sample Generation
Imperfect Data Learning