🤖 AI Summary
Problem: Black-box heterogeneous treatment effect (HTE) models often lack interpretability, which hinders clinical adoption and makes the subgroups they imply hard to validate.
Method: We propose Causal Distillation Trees (CDT), a two-stage framework that first leverages any black-box HTE estimator (e.g., XGBoost or random forests) to obtain individualized treatment effect predictions, then distills these predictions into an interpretable decision tree to identify statistically stable and clinically meaningful response subgroups.
Contribution/Results: CDT decouples black-box predictive power from tree-based interpretability. It comes with theoretical guarantees on the consistency of the estimated subgroups and introduces stability-driven diagnostics for pruning the tree and evaluating subgroup quality. On the AIDS Clinical Trials Group Study 175 (ACTG 175) randomized trial of antiretroviral treatment for HIV, CDT outperforms state-of-the-art approaches in constructing stable, clinically relevant subgroups.
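The two-stage idea in the summary can be illustrated with a minimal sketch. Everything here is an assumption made for illustration, not the authors' implementation: the data are synthetic, the first stage is a simple T-learner built from two scikit-learn random forests (standing in for "any black-box HTE estimator"), and the tree depth and leaf size are arbitrary.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.tree import DecisionTreeRegressor, export_text

rng = np.random.default_rng(0)

# Synthetic randomized trial (illustrative only):
# the true treatment effect is 2 when x0 > 0 and 0 otherwise.
n = 2000
X = rng.normal(size=(n, 3))
T = rng.integers(0, 2, size=n)            # randomized treatment assignment
tau = np.where(X[:, 0] > 0, 2.0, 0.0)     # true individual-level effect
Y = X[:, 1] + tau * T + rng.normal(scale=0.5, size=n)

# Stage 1 (teacher): a black-box estimate of the individual-level effect.
# A T-learner is used as a stand-in; any HTE estimator could take its place.
m1 = RandomForestRegressor(n_estimators=100, min_samples_leaf=5, random_state=0)
m0 = RandomForestRegressor(n_estimators=100, min_samples_leaf=5, random_state=0)
m1.fit(X[T == 1], Y[T == 1])
m0.fit(X[T == 0], Y[T == 0])
tau_hat = m1.predict(X) - m0.predict(X)

# Stage 2 (student): distill the predicted effects into a shallow tree.
# Each leaf of the tree defines one candidate subgroup.
student = DecisionTreeRegressor(max_depth=2, min_samples_leaf=100, random_state=0)
student.fit(X, tau_hat)
subgroup = student.apply(X)               # leaf id per individual

print(export_text(student, feature_names=["x0", "x1", "x2"]))
```

The student tree is fit to the teacher's predictions rather than to the raw outcomes, which is what makes this a distillation step: the tree only has to summarize a smoothed effect surface, not rediscover it from noisy data.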
📝 Abstract
Recent methodological developments have introduced new black-box approaches to better estimate heterogeneous treatment effects; however, these methods fall short of providing interpretable characterizations of the underlying individuals who may be most at risk or who may benefit most from receiving the treatment, thereby limiting their practical utility. In this work, we introduce causal distillation trees (CDT) to estimate interpretable subgroups. CDT allows researchers to fit any machine learning model to estimate the individual-level treatment effect, and then leverages a simple, second-stage tree-based model to "distill" the estimated treatment effect into meaningful subgroups. As a result, CDT inherits the improvements in predictive performance from black-box machine learning models while preserving the interpretability of a simple decision tree. We derive theoretical guarantees for the consistency of the estimated subgroups using CDT, and introduce stability-driven diagnostics for researchers to evaluate the quality of the estimated subgroups. We illustrate our proposed method on a randomized controlled trial of antiretroviral treatment for HIV from the AIDS Clinical Trials Group Study 175 and show that CDT outperforms state-of-the-art approaches in constructing stable, clinically relevant subgroups.
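One generic way to probe subgroup stability, in the spirit of the diagnostics mentioned in the abstract, is to refit the distillation tree on bootstrap resamples and measure how well the resulting partitions agree. The sketch below is an assumption-laden illustration, not the paper's diagnostic: the "teacher" predictions are simulated directly, the agreement measure is the adjusted Rand index from scikit-learn, and the helper names `bootstrap_partitions` and `stability` are hypothetical.

```python
from itertools import combinations

import numpy as np
from sklearn.metrics import adjusted_rand_score
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)

# Stand-in for stage-1 output: noisy predicted effects with one true split at x0 = 0.
n = 1000
X = rng.normal(size=(n, 3))
tau_hat = np.where(X[:, 0] > 0, 2.0, 0.0) + rng.normal(scale=0.3, size=n)


def bootstrap_partitions(X, tau_hat, depth, n_boot=20):
    """Fit one tree per bootstrap resample; return leaf ids on the full sample."""
    parts = []
    for b in range(n_boot):
        idx = rng.integers(0, len(X), size=len(X))   # sample rows with replacement
        tree = DecisionTreeRegressor(max_depth=depth, min_samples_leaf=50, random_state=b)
        tree.fit(X[idx], tau_hat[idx])
        parts.append(tree.apply(X))
    return parts


def stability(parts):
    """Mean pairwise adjusted Rand index between the bootstrap partitions."""
    return float(np.mean([adjusted_rand_score(a, b) for a, b in combinations(parts, 2)]))


# Compare candidate depths: higher scores mean more reproducible subgroups.
scores = {d: stability(bootstrap_partitions(X, tau_hat, d)) for d in (1, 2, 3)}
for depth, score in scores.items():
    print(f"depth={depth}  stability={score:.2f}")
```

In this simulated setting only the first split reflects real effect heterogeneity, so deeper trees tend to carve up noise and agree less across resamples. A score like this could inform how deep a tree to keep, though the choice of agreement measure and resampling scheme here is only one of several reasonable options.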