Bayesian Nonparametric Causal Inference for High-Dimensional Nutritional Data via Factor-Based Exposure Mapping

📅 2026-01-23
📈 Citations: 0
Influential: 0
🤖 AI Summary
This study addresses the challenge that high-dimensional, highly correlated dietary intake data pose for causal inference in nutritional epidemiology, where interest lies in latent dietary patterns and their heterogeneous effects on health outcomes. The authors propose an exposure mapping framework that uses factor analysis to reduce the high-dimensional treatment space and identify dietary patterns, and they extend Bayesian Causal Forests to handle three ordered levels of dietary exposure when estimating heterogeneous causal effects. The method is evaluated in simulation studies and applied to a multi-center cohort of Hispanic/Latino adults in the US, where six dietary patterns are identified. Higher consumption of the plant lipid-antioxidant, plant-based, animal protein, and dairy product patterns is associated with reduced risk in terms of body mass index and fasting insulin levels.

📝 Abstract
Diet plays a crucial role in health, and understanding the causal effects of dietary patterns is essential for informing public health policy and personalized nutrition strategies. However, causal inference in nutritional epidemiology faces several challenges: (i) high-dimensional and correlated food/nutrient intake data induce massive treatment levels; (ii) nutritional studies are interested in latent dietary patterns rather than single food items; and (iii) the goal is to estimate heterogeneous causal effects of these dietary patterns on health outcomes. We address these challenges by introducing a sophisticated exposure mapping framework that reduces the high-dimensional treatment space via factor analysis and enables the identification of dietary patterns. We also extend the Bayesian Causal Forest to accommodate three ordered levels of dietary exposure, better capturing the complex structure of nutritional data and enabling estimation of heterogeneous causal effects. We evaluate the proposed method through extensive simulations and apply it to a multi-center epidemiological study of Hispanic/Latino adults residing in the US. Using high-dimensional dietary data, we identify six dietary patterns and estimate their causal link with two key health risk factors: body mass index and fasting insulin levels. Our findings suggest that higher consumption of plant lipid-antioxidant, plant-based, animal protein, and dairy product patterns is associated with reduced risk.
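
The abstract's idea of reducing many correlated intake variables to a few dietary-pattern scores, then coding each pattern as three ordered exposure levels, can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a principal-component decomposition as a simple stand-in for factor analysis, and it assumes tertile cut points for the three levels, which the abstract does not specify. The function names `factor_scores` and `ordinal_exposure` and the simulated data are hypothetical.

```python
import numpy as np


def factor_scores(X, n_factors):
    """Return (n, n_factors) pattern scores.

    Assumption: principal components of the standardized data are used
    as a stand-in for a fitted factor-analysis model.
    """
    Z = (X - X.mean(axis=0)) / X.std(axis=0)
    # Rows of Vt are the principal directions, ordered by explained variance.
    _, _, Vt = np.linalg.svd(Z, full_matrices=False)
    return Z @ Vt[:n_factors].T


def ordinal_exposure(scores):
    """Map each score column to levels 0 (low), 1 (medium), 2 (high).

    Assumption: cut points are the per-column tertiles.
    """
    cuts = np.quantile(scores, [1 / 3, 2 / 3], axis=0)  # shape (2, n_factors)
    return (scores > cuts[0]).astype(int) + (scores > cuts[1]).astype(int)


# Simulated intake data: 300 people, 12 correlated items, 2 latent patterns.
rng = np.random.default_rng(0)
n, p, k = 300, 12, 2
latent = rng.normal(size=(n, k))
loadings = rng.normal(size=(k, p))
X = latent @ loadings + 0.5 * rng.normal(size=(n, p))

scores = factor_scores(X, k)
levels = ordinal_exposure(scores)  # one three-level exposure per pattern
```

Each column of `levels` could then serve as the ordinal treatment variable for one dietary pattern in a downstream causal model. The model itself, the extended Bayesian Causal Forest, is not sketched here.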
Problem

Research questions and friction points this paper is trying to address.

causal inference
high-dimensional data
dietary patterns
heterogeneous treatment effects
nutritional epidemiology
Innovation

Methods, ideas, or system contributions that make the work stand out.

Bayesian Nonparametrics
Factor-Based Exposure Mapping
High-Dimensional Causal Inference
Dietary Patterns
Heterogeneous Treatment Effects
Dafne Zorzetto
Brown University
Bayesian Causal Inference
Zizhao Xie
Department of Biostatistics, Brown University
Julian Stamp
Data Science Institute, Brown University; Center for Computational Molecular Biology, Brown University
Arman Oganisian
Brown University
Bayesian nonparametrics, Causal Inference, Bayesian Statistics, Missing Data
R. D. Vito
Department of Biostatistics, Brown University; Department of Statistical Science, La Sapienza University of Rome