Prediction-Powered Semi-Supervised Learning with Online Power Tuning

📅 2025-10-26
🤖 AI Summary
In semi-supervised learning (SSL), low-quality pseudo-labels introduce estimation bias, and existing methods struggle to dynamically balance gradient contributions from labeled and unlabeled data. This paper proposes a prediction-powered inference framework that constructs an unbiased gradient estimator by explicitly correcting for pseudo-label error. A one-dimensional online learning algorithm jointly optimizes the interpolation weight and the model parameters, calibrating the contribution of pseudo-labels in real time during training. Unlike offline hyperparameter tuning or fixed-weight schemes, the approach eliminates manual weight selection and adapts continuously to label noise. Experiments on synthetic and multiple real-world benchmark datasets show that the method significantly outperforms classical SSL approaches such as UDA and FixMatch, as well as PPI variants whose interpolation weight is optimized offline, achieving substantial gains in classification accuracy while robustly mitigating the performance degradation induced by pseudo-label noise.

📝 Abstract
Prediction-Powered Inference (PPI) is a recently proposed statistical inference technique for parameter estimation that leverages pseudo-labels on both labeled and unlabeled data to construct an unbiased, low-variance estimator. In this work, we extend its core idea to semi-supervised learning (SSL) for model training, introducing a novel unbiased gradient estimator. This extension addresses a key challenge in SSL: while unlabeled data can improve model performance, its benefit heavily depends on the quality of pseudo-labels. Inaccurate pseudo-labels can introduce bias, leading to suboptimal models. To balance the contributions of labeled and pseudo-labeled data, we utilize an interpolation parameter and tune it on the fly, alongside the model parameters, using a one-dimensional online learning algorithm. We verify the practical advantage of our approach through experiments on both synthetic and real datasets, demonstrating improved performance over classic SSL baselines and PPI methods that tune the interpolation parameter offline.
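The abstract's core construction can be sketched in a few lines. The sketch below follows the standard PPI recipe: the pseudo-label gradient on labeled data is subtracted from the pseudo-label gradient on unlabeled data, so the estimator stays unbiased for any interpolation weight. Function and variable names (`grad_fn`, `pseudo_fn`, `lam`) are illustrative, not the paper's notation.

```python
import numpy as np

def ppi_gradient(grad_fn, theta, X_lab, y_lab, X_unlab, pseudo_fn, lam):
    """PPI-style gradient estimate (illustrative sketch).

    grad_fn(theta, X, y) -> average gradient of the loss on batch (X, y).
    pseudo_fn(X)         -> pseudo-labels from the auxiliary predictor.
    lam                  -> interpolation ("power-tuning") weight in [0, 1].
    """
    g_lab = grad_fn(theta, X_lab, y_lab)                       # true labels
    g_lab_pseudo = grad_fn(theta, X_lab, pseudo_fn(X_lab))     # pseudo labels, labeled data
    g_unlab_pseudo = grad_fn(theta, X_unlab, pseudo_fn(X_unlab))  # pseudo labels, unlabeled data
    # In expectation, g_lab_pseudo matches g_unlab_pseudo, so the lam-weighted
    # pseudo-label terms cancel and the estimate is unbiased for every lam.
    return lam * g_unlab_pseudo + g_lab - lam * g_lab_pseudo
```

With `lam = 0` the estimator falls back to the purely supervised gradient; with `lam = 1` it uses the full pseudo-labeled correction, matching the classic PPI estimator.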
Problem

Research questions and friction points this paper is trying to address.

Extends PPI to semi-supervised learning for unbiased gradient estimation
Addresses bias from inaccurate pseudo-labels in semi-supervised learning
Tunes interpolation parameter online to balance labeled and pseudo-labeled data
Innovation

Methods, ideas, or system contributions that make the work stand out.

Uses unbiased gradient estimator for semi-supervised learning
Tunes interpolation parameter online with one-dimensional algorithm
Balances labeled and pseudo-labeled data contributions dynamically
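The one-dimensional online update for the interpolation weight can be illustrated as a projected online-gradient step. This sketch uses the squared norm of the combined gradient as a simple per-step surrogate for estimator variance; the paper's exact one-dimensional objective and update rule may differ, and all names here are hypothetical.

```python
import numpy as np

def tune_lambda_step(lam, g_lab, g_lab_pseudo, g_unlab_pseudo, lr=0.1):
    """One projected online-gradient step on the interpolation weight (sketch).

    Minimizes ||g(lam)||^2 where g(lam) = g_lab + lam * (g_unlab_pseudo - g_lab_pseudo),
    a crude variance proxy, then projects lam back onto [0, 1].
    """
    direction = g_unlab_pseudo - g_lab_pseudo
    g = g_lab + lam * direction
    lam = lam - lr * 2.0 * float(g @ direction)   # d/dlam ||g(lam)||^2
    return float(np.clip(lam, 0.0, 1.0))          # keep lam in [0, 1]
```

In a training loop this step would run alongside each model-parameter update, so the weight adapts continuously as pseudo-label quality changes.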
Noa Shoham — Department of Electrical and Computer Engineering, Technion IIT
Ron Dorfman — PhD Student, Technion - Israel Institute of Technology (Machine Learning, Stochastic Optimization)
Shalev Shaer — Department of Electrical and Computer Engineering, Technion IIT
Kfir Y. Levy — Department of Electrical and Computer Engineering, Technion IIT
Yaniv Romano — Associate Professor of Electrical Engineering and Computer Science, Technion, Israel (Machine Learning, Statistics, Inverse Problems)