🤖 AI Summary
In semi-supervised learning (SSL), low-quality pseudo-labels introduce estimation bias, and existing methods struggle to dynamically balance the gradient contributions of labeled and unlabeled data. This paper proposes a prediction-powered inference framework that constructs an unbiased gradient estimator by explicitly modeling pseudo-label error as a correction term. A one-dimensional online learning algorithm jointly optimizes the interpolation weight and the model parameters, calibrating the contribution of pseudo-labels in real time during training. Unlike offline hyperparameter tuning or fixed-weight schemes, this approach eliminates manual weight selection and adapts continuously to label noise. Experiments on synthetic and multiple real-world benchmark datasets show that the method significantly outperforms classical SSL approaches, including UDA and FixMatch, as well as PPI variants with offline-optimized weights, achieving substantial gains in classification accuracy while robustly mitigating the performance degradation induced by pseudo-label noise.
📝 Abstract
Prediction-Powered Inference (PPI) is a recently proposed statistical inference technique for parameter estimation that leverages pseudo-labels on both labeled and unlabeled data to construct an unbiased, low-variance estimator. In this work, we extend its core idea to semi-supervised learning (SSL) for model training, introducing a novel unbiased gradient estimator. This extension addresses a key challenge in SSL: while unlabeled data can improve model performance, its benefit depends heavily on the quality of the pseudo-labels, and inaccurate pseudo-labels introduce bias that leads to suboptimal models. To balance the contributions of labeled and pseudo-labeled data, we use an interpolation parameter and tune it on the fly, alongside the model parameters, with a one-dimensional online learning algorithm. We verify the practical advantage of our approach through experiments on both synthetic and real datasets, demonstrating improved performance over classic SSL baselines and over PPI methods that tune the interpolation parameter offline.
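To make the construction concrete, the sketch below illustrates a PPI-style gradient estimator on a toy linear regression problem: the labeled-data gradient is augmented with λ times the difference between the pseudo-label gradient on unlabeled data and the pseudo-label gradient on labeled data, which is unbiased for any λ because the two pseudo-label terms share the same expectation. The pseudo-labeler `theta_f`, the squared loss, and the lookahead-based online update of λ are illustrative assumptions, not the paper's exact algorithm.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 3
theta_star = rng.normal(size=d)        # true regression parameter
theta_f = theta_star + 0.5             # deliberately biased pseudo-labeler (hypothetical)

def grad_mse(X, y, theta):
    """Gradient of the mean squared error (1/n) * ||X @ theta - y||^2."""
    return 2.0 * X.T @ (X @ theta - y) / len(y)

def ppi_gradient(Xl, yl, Xu, theta, lam):
    """Labeled gradient plus lam times the pseudo-label correction.
    Unbiased for every lam: the two pseudo-label gradient terms have
    the same expectation, so their difference has mean zero."""
    g_lab = grad_mse(Xl, yl, theta)
    g_u_ps = grad_mse(Xu, Xu @ theta_f, theta)   # pseudo-labels on unlabeled data
    g_l_ps = grad_mse(Xl, Xl @ theta_f, theta)   # pseudo-labels on labeled data
    return g_lab + lam * (g_u_ps - g_l_ps)

def sample_batch(n_l=50, n_u=500, noise=0.1):
    Xl = rng.normal(size=(n_l, d))
    yl = Xl @ theta_star + noise * rng.normal(size=n_l)
    Xu = rng.normal(size=(n_u, d))
    return Xl, yl, Xu

# Joint online optimization of theta and the interpolation weight lam.
eta_theta, eta_lam, lam = 0.05, 0.01, 0.5
theta = np.zeros(d)
for _ in range(300):
    Xl, yl, Xu = sample_batch()

    # One-dimensional online step on lam: finite-difference derivative of
    # the labeled loss after a lookahead parameter update (an illustrative
    # surrogate for the paper's online tuning rule).
    def lookahead_loss(l):
        th = theta - eta_theta * ppi_gradient(Xl, yl, Xu, theta, l)
        return np.mean((Xl @ th - yl) ** 2)

    g_lam = (lookahead_loss(lam + 1e-3) - lookahead_loss(lam - 1e-3)) / 2e-3
    lam = float(np.clip(lam - eta_lam * g_lam, 0.0, 1.0))

    theta -= eta_theta * ppi_gradient(Xl, yl, Xu, theta, lam)
```

Because the correction term is mean-zero by construction, the bias of `theta_f` never propagates into the parameter estimate; λ only trades off how much of the (correlated) pseudo-label signal is mixed in, which is why it can be tuned online without risking the estimator's unbiasedness.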