Local Overidentification and Efficiency Gains in Modern Causal Inference and Data Combination

📅 2025-10-18
🤖 AI Summary
This paper studies nonparametric local (over-)identification in modern causal inference, focusing on models that combine nonparametric endogeneity with local overidentification. In such models, existing work typically derives the semiparametric efficiency bound only after imposing strong conditions that restore just-identification.
Method: We use conditional moment restrictions as a unifying framework. Structural models with latent variables are first translated into their induced statistical models of observables, and the local overidentification inherent in the negative control and long-term causal inference models is then characterized. We relax the just-identification conditions and derive a general semiparametric efficiency bound. We also show that some doubly robust estimators are inefficient under overidentification and characterize efficient estimators that attain the bound.
Contribution: Our work provides a unified nonparametric framework that covers both just-identified and overidentified settings, extending semiparametric efficiency theory to data combination and to causal models with unobserved confounding.

📝 Abstract
This paper studies nonparametric local (over-)identification, in the sense of Chen and Santos (2018), and the associated semiparametric efficiency in modern causal frameworks. We develop a unified approach that begins by translating structural models with latent variables into their induced statistical models of observables and then analyzes local overidentification through conditional moment restrictions. We apply this approach to three leading models: (i) the general treatment model under unconfoundedness, (ii) the negative control model, and (iii) the long-term causal inference model under unobserved confounding. The first design yields a locally just-identified statistical model, implying that all regular asymptotically linear estimators of the treatment effect share the same asymptotic variance, equal to the (trivial) semiparametric efficiency bound. In contrast, the latter two models involve nonparametric endogeneity and are naturally locally overidentified; consequently, some doubly robust orthogonal moment estimators of the average treatment effect are inefficient. Whereas existing work typically imposes strong conditions to restore just-identification before deriving the efficiency bound, we relax such assumptions and characterize the general efficiency bound, along with efficient estimators, in the overidentified models (ii) and (iii).
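To make the efficiency claim concrete, here is a minimal parametric toy example, not the paper's nonparametric estimator. It assumes a hypothetical setup in which the target is θ = E[X] and the model additionally imposes E[Y] = 0 for a variable Y correlated with X. The extra restriction overidentifies θ: the sample mean of X remains consistent but ignores it, while the adjusted estimator below uses it and has asymptotic variance smaller by a factor of roughly 1 − ρ², where ρ is the correlation between X and Y. The function name, parameters, and data-generating process are all illustrative assumptions.

```python
import numpy as np


def simulate_variances(n_obs=500, n_rep=2000, rho=0.8, theta=1.0, seed=0):
    """Monte Carlo comparison of two estimators of theta = E[X].

    Toy assumption (not from the paper): the model also imposes E[Y] = 0,
    with Corr(X, Y) = rho and unit variances.
    """
    rng = np.random.default_rng(seed)
    cov = np.array([[1.0, rho], [rho, 1.0]])
    # Shape (n_rep, n_obs, 2): one simulated sample per replication.
    draws = rng.multivariate_normal([theta, 0.0], cov, size=(n_rep, n_obs))
    x, y = draws[..., 0], draws[..., 1]

    x_bar = x.mean(axis=1)
    y_bar = y.mean(axis=1)

    # Estimator 1: sample mean of X, which ignores the restriction E[Y] = 0.
    naive = x_bar

    # Estimator 2: subtract the part of x_bar predicted by y_bar, whose
    # population value is known to be zero under the restriction.
    cov_xy = ((x - x_bar[:, None]) * y).mean(axis=1)
    var_y = (y ** 2).mean(axis=1)  # uses E[Y] = 0
    adjusted = x_bar - (cov_xy / var_y) * y_bar

    return {
        "naive_mean": naive.mean(),
        "adjusted_mean": adjusted.mean(),
        "naive_var": naive.var(),
        "adjusted_var": adjusted.var(),
    }


if __name__ == "__main__":
    result = simulate_variances()
    print(result)
    print("variance ratio:", result["adjusted_var"] / result["naive_var"])
```

In a just-identified model no such extra restriction exists, so there is nothing for the adjustment to exploit; this mirrors, in a much simpler setting, the contrast the abstract draws between model (i) and models (ii) and (iii).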
Problem

Research questions and friction points this paper is trying to address.

Analyzing local overidentification in causal models with latent variables
Developing efficiency bounds for treatment effects under unobserved confounding
Characterizing efficient estimators in overidentified negative control models
Innovation

Methods, ideas, or system contributions that make the work stand out.

Translates latent structural models into observable statistical models
Analyzes local overidentification via conditional moment restrictions
Characterizes efficiency bounds and estimators in overidentified models
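
As a rough sketch of what a conditional moment restriction on observables can look like in the negative control model, consider the formulation below. The symbols are assumptions for illustration, following a common setup in the negative control literature rather than notation taken from the paper: $A$ is a binary treatment, $Y$ the outcome, $X$ covariates, $Z$ and $W$ the negative control exposure and outcome, and $h$ an unknown outcome bridge function.

```latex
% Illustrative notation (assumed, not taken from the paper)
\begin{align}
  \mathbb{E}\bigl[\, Y - h(A, W, X) \mid A, Z, X \,\bigr] &= 0, \\
  \theta &= \mathbb{E}\bigl[\, h(1, W, X) - h(0, W, X) \,\bigr].
\end{align}
```

Informally, when the first restriction can fail for some distributions of the observables, the model is locally overidentified in the sense of Chen and Santos (2018), and different regular estimators of $\theta$ can then have different asymptotic variances.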
Xiaohong Chen
Yale University
Haitian Xie
Peking University
Economics