Empirical stratification for treatment effect heterogeneity with post-treatment variables

📅 2026-06-09

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This study addresses the challenge of estimating heterogeneous treatment effects with respect to post-treatment variables, such as noncompliance, where naively conditioning on the observed variable induces endogenous selection bias. The authors propose an assumption-lean empirical stratification framework that predicts potential post-treatment responses from baseline covariates to construct an empirical score, which in turn defines observable subgroups for effect estimation. The resulting empirical-stratum treatment effects (ETEs) are identifiable under standard causal assumptions. The framework connects to principal stratification: the average ETE recovers principal causal effects under principal ignorability, yet remains informative when this assumption fails. To capture effect heterogeneity flexibly, the method introduces projected ETE curves, estimated with efficient influence function-based estimators for semiparametric inference. The framework is illustrated with two real-world applications.

📝 Abstract

Post-treatment variables (PVs), such as treatment noncompliance, behavioral responses, and intercurrent events, often modify the ultimate treatment effect on the primary outcome. However, existing methods provide limited tools for studying treatment effect heterogeneity with respect to PVs. Conventional heterogeneous treatment effect estimands condition on baseline covariates, but conditioning on the observed PV in the same way can induce endogenous selection bias in treatment effect estimation. Principal stratification offers a rigorous framework for studying principal causal effects across principal strata, but principal strata are latent and their identification often requires stringent assumptions. This paper develops an assumption-lean empirical stratification framework for characterizing treatment effect heterogeneity with respect to PVs. We define empirical scores using the predicted potential PV responses based on baseline covariates, and use the empirical scores to construct empirically accessible subgroups. The resulting empirical-stratum treatment effects (ETEs) are identifiable under standard causal assumptions. We connect the proposed framework to principal stratification by showing that the average ETE recovers principal causal effects under the principal ignorability assumption, but remains informative under violations of this assumption. We further introduce projected ETE curves and develop efficient influence function-based estimators for semiparametric inference. We illustrate the proposed framework with two real-world applications.
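The stratification idea described in the abstract can be illustrated with a minimal sketch. It rests on several assumptions that are not taken from the paper: a randomized binary treatment `z`, a single discrete baseline covariate `x`, a binary PV `d` with one-sided noncompliance, an empirical score defined as the treated-arm mean of `d` at each value of `x`, and fixed score cutoffs. The function names `empirical_scores` and `stratum_effects` are hypothetical, and the plain difference in means stands in for the paper's influence function-based estimators, which are not reproduced here.

```python
import random
from collections import defaultdict


def mean(values):
    return sum(values) / len(values)


def empirical_scores(rows):
    """Predict the PV response under treatment from the baseline covariate.

    Assumption: with a discrete covariate, the prediction is simply the
    treated-arm mean of d at each covariate value x.
    """
    d_by_x = defaultdict(list)
    for x, z, d, y in rows:
        if z == 1:
            d_by_x[x].append(d)
    return {x: mean(ds) for x, ds in d_by_x.items()}


def stratum_effects(rows, scores, cutoffs):
    """Difference in mean outcomes between arms within each score bin.

    Bins depend only on baseline covariates (through the score), so under
    randomization the within-bin contrast is not affected by selecting on
    the observed PV.
    """
    arms = defaultdict(lambda: {0: [], 1: []})
    for x, z, d, y in rows:
        bin_index = sum(scores[x] > c for c in cutoffs)
        arms[bin_index][z].append(y)
    return {k: mean(a[1]) - mean(a[0]) for k, a in sorted(arms.items())}


# Simulated trial (all numbers are illustrative assumptions).
rng = random.Random(0)
p_comply = {0: 0.2, 1: 0.5, 2: 0.8}  # P(d = 1 | z = 1, x)
rows = []
for _ in range(20000):
    x = rng.randrange(3)
    z = rng.randrange(2)
    d = int(z == 1 and rng.random() < p_comply[x])  # no uptake under control
    y = 2.0 * d + rng.gauss(0.0, 1.0)  # treatment acts only through d
    rows.append((x, z, d, y))

scores = empirical_scores(rows)
effects = stratum_effects(rows, scores, cutoffs=(1 / 3, 2 / 3))
# By construction the true stratum effects are 0.4, 1.0 and 1.6.
print(effects)
```

Because the bins are built from the predicted score rather than from the observed `d`, every unit in both arms is assigned to a stratum, which is what makes the subgroups empirically accessible in this toy setting.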

Problem

Research questions and friction points this paper is trying to address.

treatment effect heterogeneity

post-treatment variables

principal stratification

causal inference

endogenous selection bias

Innovation

Methods, ideas, or system contributions that make the work stand out.

empirical stratification

post-treatment variables

treatment effect heterogeneity