Causal inference amid missingness-specific independencies and mechanism shifts

📅 2025-06-18

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Problem: Conventional recovery of causal effects under missing data relies on $m$-graphs, which assume that missingness mechanisms do not directly influence substantive variables. That assumption fails when missingness induces mechanism shifts, that is, when the absence of key information alters downstream actions and states. Method: We propose labeled structural causal models ($lm$-SCMs) and $lm$-graphs, which add a label set to encode the context-specific independencies (CSI) induced by missingness. We define two causal effects: the Full Average Treatment Effect (FATE), the effect in a hypothetical scenario had no data been missing, and the Natural Average Treatment Effect (NATE), the effect under the unaltered CSIs in the system. We give recovery criteria for both, present doubly-robust estimators for a graphical model inspired by a real-world application, and compare the estimands and estimation methods in simulations. Results: Applied to data on Norwegian children, the framework suggests a small effect of ADHD treatment on test achievement, with a slight shift in the effect due to missing pre-test scores.

📝 Abstract

The recovery of causal effects in structural models with missing data often relies on $m$-graphs, which assume that missingness mechanisms do not directly influence substantive variables. Yet, in many real-world settings, missing data can alter decision-making processes, as the absence of key information may affect downstream actions and states. To overcome this limitation, we introduce $lm$-SCMs and $lm$-graphs, which extend $m$-graphs by integrating a label set that represents relevant context-specific independencies (CSI), accounting for mechanism shifts induced by missingness. We define two causal effects within these systems: the Full Average Treatment Effect (FATE), which reflects the effect in a hypothetical scenario had no data been missing, and the Natural Average Treatment Effect (NATE), which captures the effect under the unaltered CSIs in the system. We propose recovery criteria for these queries and present doubly-robust estimators for a graphical model inspired by a real-world application. Simulations highlight key differences between these estimands and estimation methods. Findings from the application case suggest a small effect of ADHD treatment on test achievement among Norwegian children, with a slight effect shift due to missing pre-test scores.
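To make the distinction between the two estimands concrete, the following toy simulation may help. It is only a loose illustration and not the paper's formal definitions or its graphical model: the variable names, the 70% observation rate, and the coefficients 1.0 and 0.4 are all assumptions chosen for the example. The outcome mechanism is allowed to depend on whether the pre-test was observed, so the effect "had no data been missing" differs from the effect under the missingness pattern as it naturally occurs.

```python
import numpy as np


def toy_effects(n=200_000, p_observed=0.7, seed=0):
    """Toy contrast between a 'full' and a 'natural' average effect.

    All numbers are illustrative assumptions, not values from the paper.
    """
    rng = np.random.default_rng(seed)
    x = rng.normal(size=n)              # hypothetical pre-test score
    r = rng.random(n) < p_observed      # True where the pre-test is observed
    noise = rng.normal(size=n)

    def outcome(a, observed):
        # Mechanism shift: the treatment coefficient depends on whether
        # the pre-test was observed (1.0) or missing (0.4).
        tau = np.where(observed, 1.0, 0.4)
        return tau * a + 0.5 * x + noise

    # 'Full' world: every unit behaves as if its pre-test had been observed.
    all_observed = np.ones(n, dtype=bool)
    full = np.mean(outcome(1, all_observed) - outcome(0, all_observed))

    # 'Natural' world: missingness is left as it occurred.
    natural = np.mean(outcome(1, r) - outcome(0, r))
    return full, natural


full_effect, natural_effect = toy_effects()
print(f"full-data effect:  {full_effect:.3f}")
print(f"natural effect:    {natural_effect:.3f}")
```

In this toy setup the natural effect is a mixture of the two regimes, roughly 0.7 × 1.0 + 0.3 × 0.4, while the full-data effect equals the coefficient of the observed regime. The potential outcomes are simulated directly here; recovering such quantities from incomplete data is what the paper's recovery criteria address.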

Problem

Research questions and friction points this paper is trying to address.

Extends m-graphs to handle missingness-induced mechanism shifts

Defines FATE and NATE for causal effects with missing data

Estimates the effect of ADHD treatment on test achievement when pre-test scores are missing

Innovation

Methods, ideas, or system contributions that make the work stand out.

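Introduces $lm$-SCMs and $lm$-graphs, which extend $m$-graphs with a label set encoding context-specific independencies

Distinguishes the effect had no data been missing (FATE) from the effect under the unaltered CSIs (NATE), with recovery criteria for each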

Proposes doubly-robust estimators for graphical models
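For readers unfamiliar with double robustness, here is a minimal sketch of a generic augmented inverse-probability-weighted (AIPW) estimator on fully observed simulated data. It is not the paper's estimator, which targets FATE and NATE in $lm$-graphs; the data-generating process, the true effect of 2.0, and the helper names `fit_logistic` and `aipw_ate` are assumptions made for illustration only.

```python
import numpy as np


def fit_logistic(X, a, n_iter=25):
    """Logistic regression by Newton-Raphson; X must include an intercept column."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-X @ beta))
        w = p * (1.0 - p)
        grad = X.T @ (a - p)
        hess = X.T @ (X * w[:, None])
        beta += np.linalg.solve(hess, grad)
    return beta


def aipw_ate(x, a, y):
    """Generic AIPW estimate of the average treatment effect of a on y."""
    X = np.column_stack([np.ones_like(x), x])

    # Propensity model: P(A = 1 | X).
    e = 1.0 / (1.0 + np.exp(-X @ fit_logistic(X, a)))

    # Outcome models: separate least-squares fits in each treatment arm.
    b1 = np.linalg.lstsq(X[a == 1], y[a == 1], rcond=None)[0]
    b0 = np.linalg.lstsq(X[a == 0], y[a == 0], rcond=None)[0]
    m1, m0 = X @ b1, X @ b0

    # Outcome-model contrast plus inverse-probability-weighted residuals.
    psi = m1 - m0 + a * (y - m1) / e - (1.0 - a) * (y - m0) / (1.0 - e)
    return psi.mean()


# Simulated confounded data (illustrative assumption; true effect is 2.0).
rng = np.random.default_rng(1)
n = 50_000
x = rng.normal(size=n)
a = (rng.random(n) < 1.0 / (1.0 + np.exp(-0.8 * x))).astype(float)
y = 2.0 * a + 1.5 * x + rng.normal(size=n)

aipw_estimate = aipw_ate(x, a, y)
naive_estimate = y[a == 1].mean() - y[a == 0].mean()
print(f"AIPW estimate:  {aipw_estimate:.3f}")
print(f"naive contrast: {naive_estimate:.3f}")
```

The estimator combines an outcome model with a propensity model and stays consistent if either one is correctly specified, which is the sense of "doubly robust". The naive difference in means is confounded by `x` in this simulation and overstates the effect.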
