Optimally taming biases in black-box models for efficient semiparametric estimation

📅 2026-06-04
📈 Citations: 0
Influential: 0

🤖 AI Summary
This work addresses suboptimal inference in semiparametric estimation caused by nuisance-function estimation errors when black-box machine learning models are used. The authors propose a new estimator that, without imposing additional assumptions, eliminates the first-order stochastic error of nuisance estimation and achieves the optimal convergence rate even when the auxiliary function cannot be consistently estimated. Developed for the partial linear model and extended to a broad class of semiparametric linear functionals, the estimator attains the sharp rate $n^{-1/2} + \delta^a_\mu + (\delta^s_\mu)^2$ and, under mild additional conditions, is asymptotically normal with minimal asymptotic variance. The resulting tuning strategy favors undersmoothing, and the results imply that classical double machine learning methods based on orthogonal scores can be substantially improved, with applications including average treatment effect estimation.
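As a rough worked illustration of the tuning claim, the stated rate can be specialized to the two tuning regimes. This only substitutes the two relations into the rate given above and assumes $\delta^s_\mu \to 0$; it is not a derivation taken from the paper.

```latex
\begin{align*}
\text{Stated rate:}\quad
  & n^{-1/2} + \delta^a_\mu + (\delta^s_\mu)^2 \\[4pt]
\text{Undersmoothing, } \delta^a_\mu \asymp (\delta^s_\mu)^2:\quad
  & n^{-1/2} + (\delta^s_\mu)^2,
  \quad\text{which is } O(n^{-1/2}) \text{ once } \delta^s_\mu = O(n^{-1/4}) \\[4pt]
\text{Classical trade-off, } \delta^a_\mu \asymp \delta^s_\mu:\quad
  & n^{-1/2} + \delta^s_\mu + (\delta^s_\mu)^2 \asymp n^{-1/2} + \delta^s_\mu,
  \quad\text{which is } O(n^{-1/2}) \text{ only if } \delta^s_\mu = O(n^{-1/2})
\end{align*}
```

Under this reading, the classical trade-off leaves a first-order term $\delta^s_\mu$ in the bound, while undersmoothing pushes the approximation error down to the order of the squared stochastic error.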
📝 Abstract
Modern semiparametric estimation often relies on flexible black-box machine learning methods to estimate nuisance functions, raising a fundamental question: how do nuisance estimation errors propagate into inference for low-dimensional target parameters? The dominant paradigm, exemplified by double machine learning (DML), yields error bounds in which nuisance estimation errors enter multiplicatively. While widely adopted, it remains unclear whether this multiplicative-rate dependence is optimal for black-box models. In this paper, we start by revisiting the partial linear model $Y = \mu_0(X) + T\cdot\beta_0 + \varepsilon$ under a structure-agnostic setting, where the nuisance function $\mu_0$ is estimated using a generic machine learning model, with approximation error $\delta^a_\mu$ and stochastic error $\delta^s_\mu$. We show that the standard DML rate is not optimal in the regime where the auxiliary function $\mathbb{E}[T \mid X=x]$ cannot be consistently estimated. We propose a new estimator for $\beta_0$ that achieves a sharper rate of $n^{-1/2} + \delta^a_\mu + (\delta^s_\mu)^2$ and establish a matching lower bound demonstrating its optimality. Our results reveal a new principle: the first-order stochastic error of nuisance estimation can be eliminated without imposing any additional assumptions. This also leads to a revised tuning strategy favoring undersmoothing, where $\delta^a_\mu \asymp (\delta^s_\mu)^2$, rather than the classical bias-variance trade-off $\delta^a_\mu \asymp \delta^s_\mu$. Under mild additional conditions, the estimator is asymptotically normal with minimal asymptotic variance. The proposed method extends to a broad class of semiparametric linear functional estimation problems, including average treatment effect estimation. Our results imply that popular orthogonal score methods in semiparametric estimation with black-box nuisance learners can be substantially improved.
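For orientation, here is a minimal sketch of the classical cross-fitted DML baseline (the partialling-out variant) for the partial linear model above. It is not the estimator proposed in the paper, whose construction the abstract does not spell out. The function names (`poly_features`, `fit_predict`, `dml_plm`, `simulate`), the polynomial least-squares fit standing in for a black-box learner, and the simulated data-generating process are all assumptions made for illustration.

```python
import numpy as np


def poly_features(x, degree=5):
    # Design matrix with columns 1, x, x^2, ..., x^degree for a 1-D covariate.
    return np.vander(x, N=degree + 1, increasing=True)


def fit_predict(x_train, target_train, x_test, degree=5):
    # Least-squares polynomial fit; a stand-in for any black-box regressor.
    coef, *_ = np.linalg.lstsq(
        poly_features(x_train, degree), target_train, rcond=None
    )
    return poly_features(x_test, degree) @ coef


def dml_plm(y, t, x, n_folds=2, degree=5, seed=0):
    """Cross-fitted partialling-out estimate of beta_0 in Y = mu_0(X) + T*beta_0 + eps."""
    rng = np.random.default_rng(seed)
    n = len(y)
    folds = np.array_split(rng.permutation(n), n_folds)
    y_res = np.empty(n)
    t_res = np.empty(n)
    for test_idx in folds:
        train = np.ones(n, dtype=bool)
        train[test_idx] = False
        # Residualize Y and T on X with nuisances fitted on the other folds.
        y_res[test_idx] = y[test_idx] - fit_predict(x[train], y[train], x[test_idx], degree)
        t_res[test_idx] = t[test_idx] - fit_predict(x[train], t[train], x[test_idx], degree)
    # Regress the Y-residuals on the T-residuals (no intercept).
    return float(t_res @ y_res / (t_res @ t_res))


def simulate(n=4000, beta0=1.5, seed=1):
    # Hypothetical data: mu_0(x) = cos(3x), E[T | X = x] = sin(2x).
    rng = np.random.default_rng(seed)
    x = rng.uniform(-1.0, 1.0, n)
    t = np.sin(2.0 * x) + rng.normal(size=n)
    y = np.cos(3.0 * x) + beta0 * t + rng.normal(size=n)
    return y, t, x


y, t, x = simulate()
beta_hat = dml_plm(y, t, x)
print(f"DML estimate of beta_0: {beta_hat:.3f} (true value in this simulation: 1.5)")
```

In this baseline the errors of the two fitted regressions enter the bias as a product, which is the multiplicative dependence the abstract refers to. It also relies on $\mathbb{E}[T \mid X=x]$ being estimated well, the condition the abstract says can fail.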
Problem

Research questions and friction points this paper is trying to address.

semiparametric estimation
nuisance estimation error
black-box models
double machine learning
bias-variance trade-off
Innovation

Methods, ideas, or system contributions that make the work stand out.

semiparametric estimation
double machine learning
nuisance function
optimal rate
orthogonal score