Average partial effect estimation using double machine learning

📅 2023-08-17
📈 Citations: 3
Influential: 0
🤖 AI Summary
This paper addresses robust estimation of average partial effects (APEs) in nonlinear regression models. We propose a double machine learning framework that dispenses with linearity and differentiability assumptions on the regression function, permitting arbitrary black-box machine learning algorithms as first-stage estimators. The method introduces resmoothing to confer differentiability on otherwise non-differentiable estimators, models the conditional distribution of the predictors through a location-scale model, and constructs a doubly robust semiparametric inference procedure. We establish theoretical guarantees: the estimator attains the semiparametric efficiency bound and remains robust under model misspecification and other nonstandard conditions. Numerical experiments demonstrate substantial improvements over existing APE estimators in both estimation accuracy and confidence interval coverage.
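For context, the display below reconstructs the standard APE target and the doubly robust moment function this literature builds on. The notation is ours, inferred from the summary and abstract, not copied from the paper.

```latex
% APE of the j-th predictor, with regression function f(x) = E[Y | X = x]:
\theta = \mathbb{E}\left[\frac{\partial f(X)}{\partial x_j}\right]

% Doubly robust (Neyman-orthogonal) moment function, where
% \rho_j(x) = -\partial_{x_j} \log p(x_j \mid x_{-j}) is the negative
% conditional score of X_j given the remaining predictors:
\psi(Y, X; \theta, f, \rho_j)
  = \frac{\partial f(X)}{\partial x_j} + \rho_j(X)\,\bigl(Y - f(X)\bigr) - \theta

% Integration by parts gives E[\rho_j(X) g(X)] = E[\partial_{x_j} g(X)], so
% E[\psi] = 0 holds if either f or \rho_j is correct: double robustness.
```

Averaging ψ over cross-fitted folds and solving for θ gives the estimator; the regression function f is supplied by the resmoothed first-stage learner, and ρ_j by the location-scale model.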
📝 Abstract
Single-parameter summaries of variable effects are desirable for ease of interpretation, but linear models, which would deliver these, may fit poorly to the data. A modern approach is to estimate the average partial effect -- the average slope of the regression function with respect to the predictor of interest -- using a doubly robust semiparametric procedure. Most existing work has focused on specific forms of nuisance function estimators. We extend the scope to arbitrary plug-in nuisance function estimation, allowing for the use of modern machine learning methods which in particular may deliver non-differentiable regression function estimates. Our procedure involves resmoothing a user-chosen first-stage regression estimator to produce a differentiable version, and modelling the conditional distribution of the predictors through a location-scale model. We show that our proposals lead to a semiparametric efficient estimator under relatively weak assumptions. Our theory makes use of a new result on the sub-Gaussianity of Lipschitz score functions that may be of independent interest. We demonstrate the attractive numerical performance of our approach in a variety of settings including ones with misspecification.
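To illustrate the resmoothing step described in the abstract, here is a minimal sketch, assuming a generic black-box regressor: convolving the fitted function with a Gaussian kernel in the predictor of interest yields a differentiable surrogate, and Stein's identity turns its partial derivative into a Monte Carlo average. The function name, bandwidth, and Monte Carlo scheme are illustrative choices, not the paper's implementation.

```python
import numpy as np

def resmooth_partial_derivative(fhat, X, j, bandwidth=0.1, n_mc=200, seed=0):
    """Monte Carlo estimate of the partial derivative (w.r.t. column j) of a
    Gaussian-resmoothed version of an arbitrary black-box regressor `fhat`.

    The smoothed surrogate f_h(x) = E[fhat(x + h*Z*e_j)], Z ~ N(0, 1), is
    differentiable in x_j even when fhat is not (e.g. a tree ensemble), and
    Stein's identity gives d/dx_j f_h(x) = E[fhat(x + h*Z*e_j) * Z] / h.
    """
    rng = np.random.default_rng(seed)
    deriv = np.zeros(X.shape[0])
    for zk in rng.standard_normal(n_mc):
        X_pert = X.copy()
        X_pert[:, j] += bandwidth * zk   # perturb only the predictor of interest
        deriv += fhat(X_pert) * zk       # accumulate fhat * Z
    return deriv / (n_mc * bandwidth)    # average and divide by the bandwidth
```

Here `fhat` could be, for example, the `predict` method of a fitted scikit-learn `RandomForestRegressor`, whose piecewise-constant output has no useful derivative without such smoothing.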
Problem

Research questions and friction points this paper is trying to address.

Estimating average partial effects as interpretable single-number summaries of variable effects in regression
Overcoming the non-differentiability of regression estimates produced by modern machine learning methods
Controlling error by estimating the conditional mean and standard deviation of the predictor of interest
Innovation

Methods, ideas, or system contributions that make the work stand out.

Double machine learning framework for average partial effect estimation
Resmoothing of non-differentiable first-stage regression estimators
Location-scale model for the conditional distribution of the predictor of interest (see the sketch after this list)
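A minimal sketch of the location-scale idea, assuming random forest nuisance regressions and a kernel density estimate of the residual score; the names and modelling choices here are illustrative, not the paper's own estimators. Under X_j = m(X_{-j}) + s(X_{-j})·ε with ε independent of X_{-j}, the conditional score needed by the doubly robust moment function is ρ_j(x) = −ψ_ε(ε)/s(x_{-j}).

```python
import numpy as np
from scipy.stats import gaussian_kde
from sklearn.ensemble import RandomForestRegressor

def conditional_score(X, j, seed=0):
    """Estimate rho_j(x) = -d/dx_j log p(x_j | x_{-j}) at the sample points,
    under a location-scale model X_j = m(X_{-j}) + s(X_{-j}) * eps.
    """
    Z = np.delete(X, j, axis=1)
    xj = X[:, j]

    # Location: regress X_j on the remaining predictors.
    m_hat = RandomForestRegressor(random_state=seed).fit(Z, xj)
    centred = xj - m_hat.predict(Z)

    # Scale: regress |residual| on the remaining predictors. Any positive
    # multiplicative constant cancels in rho_j, so this proxy suffices.
    s_hat = RandomForestRegressor(random_state=seed).fit(Z, np.abs(centred))
    s = np.clip(s_hat.predict(Z), 1e-3, None)
    eps = centred / s

    # Score of the residual density via a Gaussian KDE and a central
    # difference. (Cross-fitting, used in practice, is omitted for brevity.)
    kde, h = gaussian_kde(eps), 1e-3
    psi = (np.log(kde(eps + h)) - np.log(kde(eps - h))) / (2 * h)
    return -psi / s
```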
Harvey Klyne
Statistical Laboratory, University of Cambridge, United Kingdom
Rajen D. Shah
University of Cambridge
Statistics · Machine Learning