Treatment response as a latent variable

📅 2025-02-12

🤖 AI Summary

Clinical response varies naturally, which makes it hard to distinguish responders from non-responders and is a key bottleneck in causal analysis of response heterogeneity. To address this, the authors propose the causal two-groups (C2G) model, which treats treatment response as a latent variable, and introduce two empirical Bayes procedures: one semi-parametric and one fully nonparametric. The semi-parametric model assumes additive treatment effects and is identifiable from observed data. The nonparametric model is not identifiable, so the authors define two new estimands and give a strategy for deriving intervals for them. Combining causal inference, latent variable modeling, and false discovery rate (FDR) control, both procedures are shown theoretically and empirically to control the FDR at the target level with near-optimal power. Applied to cancer immunotherapy data, the nonparametric C2G model recovers clinically validated predictive biomarkers of both positive and negative outcomes.

📝 Abstract

Scientists often need to analyze the samples in a study that responded to treatment in order to refine their hypotheses and find potential causal drivers of response. Natural variation in outcomes makes teasing apart responders from non-responders a statistical inference problem. To handle latent responses, we introduce the causal two-groups (C2G) model, a causal extension of the classical two-groups model. The C2G model posits that treated samples may or may not experience an effect, according to some prior probability. We propose two empirical Bayes procedures for the causal two-groups model, one under semi-parametric conditions and another under fully nonparametric conditions. The semi-parametric model assumes additive treatment effects and is identifiable from observed data. The nonparametric model is unidentifiable, but we show it can still be used to test for response in each treated sample. We show empirically and theoretically that both methods for selecting responders control the false discovery rate at the target level with near-optimal power. We also propose two novel estimands of interest and provide a strategy for deriving estimand intervals in the unidentifiable nonparametric model. On a cancer immunotherapy dataset, the nonparametric C2G model recovers clinically-validated predictive biomarkers of both positive and negative outcomes. Code is available at https://github.com/tansey-lab/causal2groups.
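To make the two-groups idea concrete, here is a minimal sketch of the semi-parametric setting described above, in which a treated sample either shows no effect or receives an additive shift. It is not the authors' implementation (that lives in the linked repository). It is an oracle illustration that assumes the untreated outcome distribution, the effect size, and the prior response probability are all known, whereas an empirical Bayes procedure would estimate them from data. The names `normal_pdf`, `select_responders`, `prior_response`, and `effect` are hypothetical, and the selection rule is the standard posterior-based step-up rule for two-groups models rather than anything confirmed by the abstract.

```python
import numpy as np


def normal_pdf(x, mean, sd):
    """Density of a normal distribution, written out to avoid extra dependencies."""
    z = (x - mean) / sd
    return np.exp(-0.5 * z**2) / (sd * np.sqrt(2.0 * np.pi))


def select_responders(local_fdr, alpha):
    """Select the largest set whose average posterior null probability is <= alpha.

    local_fdr[i] is the posterior probability that sample i did NOT respond.
    The average of these probabilities over the selected set estimates the FDR.
    """
    order = np.argsort(local_fdr)
    running_mean = np.cumsum(local_fdr[order]) / np.arange(1, len(local_fdr) + 1)
    passing = np.nonzero(running_mean <= alpha)[0]
    selected = np.zeros(len(local_fdr), dtype=bool)
    if passing.size > 0:
        selected[order[: passing[-1] + 1]] = True
    return selected


# Simulated treated arm (all values below are illustrative assumptions).
rng = np.random.default_rng(0)
n_treated = 2000
prior_response = 0.3  # assumed prior probability that a treated sample responds
effect = 3.0          # assumed additive treatment effect for responders

is_responder = rng.random(n_treated) < prior_response
outcomes = rng.normal(0.0, 1.0, n_treated) + effect * is_responder

# Oracle two-groups posterior: P(no response | outcome).
null_density = normal_pdf(outcomes, 0.0, 1.0)
alt_density = normal_pdf(outcomes, effect, 1.0)
mixture = (1.0 - prior_response) * null_density + prior_response * alt_density
local_fdr = (1.0 - prior_response) * null_density / mixture

selected = select_responders(local_fdr, alpha=0.1)
false_discovery_proportion = (selected & ~is_responder).sum() / max(selected.sum(), 1)
print(f"selected {selected.sum()} samples, realized FDP = {false_discovery_proportion:.3f}")
```

Because the true responder labels are known in the simulation, the realized false discovery proportion can be compared against the target level of 0.1. On real data the labels are latent, which is exactly why the posterior probabilities have to carry the inference.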

Problem

Research questions and friction points this paper is trying to address.

Identifies treatment responders using statistical models

Controls false discovery rate at the target level with near-optimal power

Recovers predictive biomarkers in immunotherapy datasets

Innovation

Methods, ideas, or system contributions that make the work stand out.

Causal two-groups model

Empirical Bayes procedures

Unidentifiable nonparametric model that still supports testing for response in each treated sample
