Learning control variables and instruments for causal analysis in observational data

📅 2024-07-05
📈 Citations: 1
Influential: 0
🤖 AI Summary
Problem: Estimating causal effects from observational data requires selecting control variables and instruments that satisfy causal identification conditions, a task that often relies on strong domain knowledge or ad hoc assumptions. Method: This paper proposes the first end-to-end joint learning framework that automatically identifies valid combinations of control variables and instruments. Grounded in conditional independence testing, the method integrates nonparametric dependence measures with structural search optimization, and its variable selection is statistically consistent under mild regularity conditions. Contribution/Results: Unlike conventional approaches that require prespecified variable sets or strong prior assumptions, the framework is fully data-driven. In simulations, it achieves significantly higher variable identification accuracy. In an empirical application to the Job Corps study, its estimated treatment effect closely aligns with the results of the randomized controlled trial, which supports both its validity and its robustness in real-world causal inference.

📝 Abstract
This study introduces a data-driven, machine learning-based method to detect suitable control variables and instruments for assessing the causal effect of a treatment on an outcome in observational data, if they exist. Our approach tests the joint existence of instruments, which are associated with the treatment but not directly with the outcome (at least conditional on observables), and suitable control variables, conditional on which the treatment is exogenous, and learns the partition of instruments and control variables from the observed data. The detection of sets of instruments and control variables relies on the condition that proper instruments are conditionally independent of the outcome given the treatment and suitable control variables. We establish the consistency of our method for detecting control variables and instruments under certain regularity conditions, investigate the finite sample performance through a simulation study, and provide an empirical application to labor market data from the Job Corps study.
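
The detection condition in the abstract can be illustrated with a small simulation. The following is a minimal sketch, not the paper's procedure: it assumes a linear data-generating process and replaces the machine learning-based test with a plain partial correlation. The variable names (`x`, `z`, `d`, `y`), the coefficients, and the helper `partial_corr` are all hypothetical choices made for illustration.

```python
import numpy as np


def partial_corr(a, b, controls):
    """Correlation of a and b after linearly partialling out `controls`."""
    C = np.column_stack([np.ones(len(a))] + list(controls))
    res_a = a - C @ np.linalg.lstsq(C, a, rcond=None)[0]
    res_b = b - C @ np.linalg.lstsq(C, b, rcond=None)[0]
    return float(np.corrcoef(res_a, res_b)[0, 1])


# Assumed (hypothetical) linear model with one confounder and one instrument.
rng = np.random.default_rng(0)
n = 20_000
x = rng.normal(size=n)                 # control variable (confounder)
z = rng.normal(size=n)                 # instrument: affects y only through d
d = z + x + rng.normal(size=n)         # treatment
y = 0.5 * d + x + rng.normal(size=n)   # outcome

# Instrument vs. outcome given treatment AND the control: close to zero.
pc_full = partial_corr(z, y, [d, x])
# Same check with the control omitted: conditioning on d alone links z to the
# confounder x, so the dependence on y reappears.
pc_no_x = partial_corr(z, y, [d])

print(f"given d and x: {pc_full:+.3f}")
print(f"given d only:  {pc_no_x:+.3f}")
```

In this setup the first value should be near zero and the second clearly negative, which is why a failed conditional independence check can signal either an invalid instrument or an insufficient set of control variables.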
Problem

Research questions and friction points this paper is trying to address.

Detecting suitable control variables and instruments for causal analysis in observational data, if they exist
Testing the joint existence of instruments and control variables with machine learning
Learning the partition of observed variables into instruments and control variables from the data
Innovation

Methods, ideas, or system contributions that make the work stand out.

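Uses machine learning to detect control variables and instruments in a data-driven way
Tests the joint existence of instruments and control variables
Learns the partition of instruments and control variables from the observed data, based on the condition that proper instruments are conditionally independent of the outcome given the treatment and suitable control variables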
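
Learning the partition can be pictured as a search over candidate splits of the observed covariates. The sketch below is an illustrative stand-in under strong assumptions, not the paper's algorithm: it enumerates every split, uses linear partial correlations instead of a machine learning-based test, and applies an ad hoc tolerance `tol` rather than a formal test. The names `find_valid_partitions`, `w0`, `w1`, `w2` and the simulated model are hypothetical.

```python
from itertools import combinations

import numpy as np


def partial_corr(a, b, controls):
    """Correlation of a and b after linearly partialling out `controls`."""
    C = np.column_stack([np.ones(len(a))] + list(controls))
    res_a = a - C @ np.linalg.lstsq(C, a, rcond=None)[0]
    res_b = b - C @ np.linalg.lstsq(C, b, rcond=None)[0]
    return float(np.corrcoef(res_a, res_b)[0, 1])


def find_valid_partitions(y, d, candidates, tol=0.05):
    """Return (instruments, controls) splits passing two heuristic checks."""
    names = sorted(candidates)
    valid = []
    for k in range(1, len(names) + 1):
        for z_names in combinations(names, k):
            x_names = tuple(c for c in names if c not in z_names)
            z_cols = [candidates[c] for c in z_names]
            x_cols = [candidates[c] for c in x_names]
            # Exclusion: every instrument is (linearly) unrelated to the
            # outcome given the treatment and the controls.
            excluded = all(
                abs(partial_corr(z, y, [d] + x_cols)) < tol for z in z_cols
            )
            # Relevance: at least one instrument is related to the treatment
            # given the controls.
            relevant = any(
                abs(partial_corr(z, d, x_cols)) > tol for z in z_cols
            )
            if excluded and relevant:
                valid.append((z_names, x_names))
    return valid


# Assumed (hypothetical) model: w0 is an instrument, w1 and w2 are confounders.
rng = np.random.default_rng(1)
n = 20_000
w0, w1, w2 = (rng.normal(size=n) for _ in range(3))
d = w0 + w1 + w2 + rng.normal(size=n)
y = 0.5 * d + w1 + w2 + rng.normal(size=n)

partitions = find_valid_partitions(y, d, {"w0": w0, "w1": w1, "w2": w2})
print(partitions)
```

Exhaustive enumeration grows exponentially with the number of covariates, so it is only workable for a handful of candidates; it is used here purely to make the idea of a learned partition concrete.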