On prediction-powered inference for quantile regression via convolution smoothing

📅 2026-06-02

🤖 AI Summary

This study addresses the computational and inferential challenges of applying prediction-powered inference to quantile regression in data-limited settings, where gold-standard labels are scarce and surrogate (predicted) labels are abundant. The authors apply convolution smoothing to the check loss and propose two computationally tractable estimators, along with an ensemble of the two. Smoothing eases the optimization difficulties caused by the discontinuous subgradient of the original objective, and numerical studies indicate that it mitigates the over-coverage of confidence intervals seen with a naive extension. The theoretical analysis establishes the asymptotic distributions of the estimators under a possibly misspecified linear quantile regression model, and the methods are illustrated through simulations and an application to a local housing dataset.
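To make the idea of smoothing the check loss concrete, here is a minimal sketch that assumes a Gaussian kernel and a user-chosen bandwidth `h`. The kernel choice, the bandwidth, and the function names are illustrative assumptions and are not taken from the paper. For a Gaussian kernel, the convolution of the check loss with the kernel has a closed form, and its derivative is continuous, unlike the subgradient of the original loss.

```python
import math


def normal_cdf(x):
    """Standard normal CDF, written with math.erf from the standard library."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def check_loss(u, tau):
    """Unsmoothed check (pinball) loss: u * (tau - 1{u < 0})."""
    return u * (tau - (1.0 if u < 0 else 0.0))


def smoothed_check_loss(u, tau, h):
    """Check loss convolved with a Gaussian kernel of bandwidth h (assumed kernel).

    Uses rho_tau(u) = |u| / 2 + (tau - 1/2) * u, so the smoothed loss is
    E|u + h Z| / 2 + (tau - 1/2) * u with Z standard normal.
    """
    z = u / h
    folded_mean = math.sqrt(2.0 / math.pi) * math.exp(-0.5 * z * z) \
        + z * (1.0 - 2.0 * normal_cdf(-z))
    return 0.5 * h * folded_mean + (tau - 0.5) * u


def smoothed_check_grad(u, tau, h):
    """Derivative of the smoothed loss in u: tau - Phi(-u / h), continuous in u."""
    return tau - normal_cdf(-u / h)


if __name__ == "__main__":
    for u in (-1.0, -0.1, 0.0, 0.1, 1.0):
        print(u, check_loss(u, 0.3), smoothed_check_loss(u, 0.3, 0.2),
              smoothed_check_grad(u, 0.3, 0.2))
```

Far from zero (relative to `h`) the smoothed loss is essentially the original check loss. Near zero it is slightly larger and has a smooth gradient, which is what makes gradient-based optimization straightforward.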

📝 Abstract

This paper studies quantile regression in a data-limited setting where the gold-standard outcome is available only for a limited number of observations, whereas a surrogate outcome is widely available. Such settings are becoming increasingly common with the availability of low-cost predictions from modern AI, motivating a growing line of research on "prediction-powered inference" for improved statistical inference. Naively extending this framework to quantile regression, however, raises two challenges: computational difficulties due to the discontinuity of the subgradient, and overly conservative confidence intervals. To address these issues, we propose a convolution-based smoothing of the check-loss objective and develop two variants of the estimator. The proposed estimators are computationally tractable, and our numerical studies show that they mitigate overcoverage. As a theoretical contribution, we establish the asymptotic distributions of the proposed estimators under a possibly misspecified linear quantile regression model. We further propose an ensemble of the two estimators and illustrate the proposed methods through simulations and an application to a local housing dataset.
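As a rough illustration of how prediction-powered inference and smoothing can fit together, the sketch below estimates a single quantile (an intercept-only model) from a small labeled sample and a larger set of predictions. It uses the generic prediction-powered form, in which an estimating equation computed on the predictions is corrected by a "rectifier" measured on the labeled data. This is not a reproduction of either of the paper's two estimators. The Gaussian kernel, the bandwidth `h`, the bisection solver, and the names `ppi_score` and `ppi_smoothed_quantile` are all assumptions made for the example.

```python
import math


def normal_cdf(x):
    """Standard normal CDF via math.erf."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def mean(values):
    return sum(values) / len(values)


def ppi_score(theta, y_lab, f_lab, f_unlab, tau, h):
    """Smoothed prediction-powered estimating equation for the tau-quantile.

    y_lab   : gold-standard outcomes on the labeled sample
    f_lab   : surrogate predictions on the same labeled sample
    f_unlab : surrogate predictions on the unlabeled sample
    The Gaussian-smoothed indicator 1{v <= theta} is Phi((theta - v) / h).
    """
    def smooth_cdf(data):
        return mean([normal_cdf((theta - v) / h) for v in data])

    # Rectifier: how far the predictions are off, measured on labeled data.
    rectifier = smooth_cdf(y_lab) - smooth_cdf(f_lab)
    return smooth_cdf(f_unlab) + rectifier - tau


def ppi_smoothed_quantile(y_lab, f_lab, f_unlab, tau, h, iters=200):
    """Solve ppi_score(theta) = 0 by bisection (hypothetical helper).

    The score is about -tau far to the left of the data and about 1 - tau
    far to the right, so the bracket below contains a sign change.
    """
    everything = list(y_lab) + list(f_lab) + list(f_unlab)
    lo = min(everything) - 10.0 * h
    hi = max(everything) + 10.0 * h
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if ppi_score(mid, y_lab, f_lab, f_unlab, tau, h) > 0.0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    # Toy data: the predictions overshoot the true outcome by exactly 1.
    y_lab = [-2.0, -1.0, 0.0, 1.0, 2.0]
    f_lab = [v + 1.0 for v in y_lab]
    f_unlab = [-1.0, 0.0, 1.0, 2.0, 3.0]
    print(ppi_smoothed_quantile(y_lab, f_lab, f_unlab, tau=0.5, h=0.5))
```

In this toy example the predictions are shifted upward by a constant, so a median computed from the predictions alone would be shifted as well, while the rectifier term removes that shift. Because the score is smooth in `theta`, a derivative-based solver could replace bisection; bisection is used here only to keep the sketch dependency-free.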

Problem

Research questions and friction points this paper is trying to address.

quantile regression

prediction-powered inference

data-limited setting

surrogate outcome

confidence intervals

Innovation

Methods, ideas, or system contributions that make the work stand out.

prediction-powered inference

quantile regression

convolution smoothing