A tutorial on discovering and quantifying the effect of latent causal sources of multimodal EHR data

📅 2025-10-15
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
This study addresses the limited causal interpretability in large-scale multimodal electronic health records (EHRs). We propose a generalizable causal machine learning framework: first, latent variable decomposition is performed via probabilistic independent component analysis to isolate underlying causal sources; second, multimodal clinical features are integrated to construct a task-specific causal inference model enabling individualized causal effect estimation. Our key methodological advance lies in jointly modeling latent variable discovery and causal structure learning—thereby mitigating challenges inherent to EHRs, including high noise, incompleteness, and strong confounding. Evaluated on two real-world clinical tasks—sepsis prognosis prediction and ICU medication response analysis—the framework successfully identifies medically interpretable latent causal factors and significantly improves both accuracy and robustness of individual causal effect estimation. This work establishes a novel paradigm for data-driven causal discovery in medicine.

Technology Category

Application Category

📝 Abstract
We provide an accessible description of a peer-reviewed generalizable causal machine learning pipeline to (i) discover latent causal sources of large-scale electronic health records observations, and (ii) quantify the source causal effects on clinical outcomes. We illustrate how imperfect multimodal clinical data can be processed, decomposed into probabilistic independent latent sources, and used to train taskspecific causal models from which individual causal effects can be estimated. We summarize the findings of the two real-world applications of the approach to date as a demonstration of its versatility and utility for medical discovery at scale.
Problem

Research questions and friction points this paper is trying to address.

Discover latent causal sources in multimodal EHR data
Quantify causal effects of sources on clinical outcomes
Process imperfect clinical data for causal modeling
Innovation

Methods, ideas, or system contributions that make the work stand out.

Discovers latent causal sources from EHR data
Decomposes data into probabilistic independent sources
Trains task-specific causal models for effect estimation
🔎 Similar Papers
No similar papers found.
M
Marco Barbero-Mota
Department of Biomedical Informatics, Vanderbilt University Medical Center
Eric V. Strobl
Eric V. Strobl
University of Pittsburgh
Causal DiscoveryCausal InferenceTranslational BioinformaticsComputational Psychiatry
J
John M. Still
Department of Biomedical Informatics, Vanderbilt University Medical Center
W
William W. Stead
Departments of Medicine & Biomedical Informatics, Vanderbilt University Medical Center
T
Thomas A. Lasko
Departments of Biomedical Informatics & Computer Science, Vanderbilt University Medical Center & Vanderbilt University