Predictive posteriors under hidden confounding

📅 2025-07-07
📈 Citations: 0
Influential: 0
🤖 AI Summary
To address the failure of cross-domain predictive generalization under distribution shift caused by latent confounding, this paper proposes a Bayesian Generative Invariance (BGI) framework. Unlike the existing frequentist generative invariance (GI) formulation, which does not always enable identification of causal structures and provides no uncertainty quantification, BGI gives the prior distributions an asymptotic interpretation in terms of the number of distinct datasets (environments) that could be observed, supporting identifiable causal discovery and well-calibrated out-of-distribution prediction without hyperparameter tuning or prior knowledge of the distribution shifts. The method combines joint Bayesian modeling across multiple environments with the principle of generative invariance. Experiments show that BGI attains empirical coverage close to nominal levels in low- to moderate-dimensional settings, outperforming overly conservative regularization and the existing frequentist approach.
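
The paper's concrete BGI model is not reproduced on this page. As a hedged illustration only, the sketch below (plain NumPy; the simulation, the conjugate Gaussian regression, and all variable names are assumptions, not the authors' construction) shows the setting the summary describes: several training environments share a hidden confounder, a naive pooled Bayesian regression is fit while ignoring it, and the coverage of its posterior predictive intervals is then checked in a shifted test environment.

```python
# Hedged toy sketch, NOT the authors' BGI model: three training environments share
# a hidden confounder u; a naive pooled conjugate Bayesian regression ignores u and
# its 95% posterior predictive intervals are evaluated in a shifted test environment.
import numpy as np

rng = np.random.default_rng(0)

def simulate_env(n, shift, beta=1.5, conf=1.0):
    """One environment: u confounds x and y; `shift` moves the x-distribution."""
    u = rng.normal(0.0, 1.0, n)                 # hidden confounder
    x = shift + u + rng.normal(0.0, 1.0, n)     # predictor influenced by u
    y = beta * x + conf * u + rng.normal(0.0, 0.5, n)
    return x, y

# Observable training environments (distinct datasets), pooled into one design.
envs = [simulate_env(200, s) for s in (-1.0, 0.0, 1.0)]
x_tr = np.concatenate([x for x, _ in envs])
y_tr = np.concatenate([y for _, y in envs])
X = np.column_stack([np.ones_like(x_tr), x_tr])

# Conjugate Bayesian linear regression: y | w ~ N(Xw, sigma2 I), w ~ N(0, tau2 I),
# with sigma2 treated as known for simplicity (another assumption of this sketch).
sigma2, tau2 = 0.5 ** 2, 10.0 ** 2
cov_w = np.linalg.inv(X.T @ X / sigma2 + np.eye(2) / tau2)
mean_w = cov_w @ (X.T @ y_tr) / sigma2

# Posterior predictive 95% intervals in an unseen, shifted environment.
x_te, y_te = simulate_env(500, shift=3.0)
X_te = np.column_stack([np.ones_like(x_te), x_te])
mu = X_te @ mean_w
var = np.einsum("ij,jk,ik->i", X_te, cov_w, X_te) + sigma2
lo, hi = mu - 1.96 * np.sqrt(var), mu + 1.96 * np.sqrt(var)
print(f"empirical coverage in shifted environment: {np.mean((y_te >= lo) & (y_te <= hi)):.3f}")
```

Because the pooled model ignores u, both its conditional mean and its noise level are off in the shifted environment, so the printed coverage falls clearly below the nominal 95%; well-calibrated external prediction despite this kind of confounding is exactly the behavior the BGI framework targets.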

📝 Abstract
Predicting outcomes in external domains is challenging due to hidden confounders that influence both predictors and outcomes, complicating generalization under distribution shifts. Traditional methods often rely on stringent assumptions or overly conservative regularization, compromising estimation and predictive accuracy. Generative Invariance (GI) is a novel framework that facilitates predictions in unseen domains without requiring hyperparameter tuning or knowledge of specific distribution shifts. However, the available frequentist version of GI does not always enable identification and lacks uncertainty quantification for its predictions. This paper develops a Bayesian formulation that extends GI with well-calibrated external predictions and facilitates causal discovery. We present theoretical guarantees showing that prior distributions assign asymptotic meaning to the number of distinct datasets that could be observed. Simulations and an application case highlight the remarkable empirical coverage behavior of our approach, nearly unchanged when transitioning from low- to moderate-dimensional settings.
Problem

Research questions and friction points this paper is trying to address.

Predicting outcomes with hidden confounders across domains
Overcoming the stringent assumptions and overly conservative regularization of traditional methods
Extending Generative Invariance with Bayesian uncertainty quantification
Innovation

Methods, ideas, or system contributions that make the work stand out.

Bayesian formulation extends Generative Invariance framework
Enables well-calibrated external predictions and causal discovery
Maintains empirical coverage from low- to moderate-dimensional settings (see the coverage-check sketch after this list)
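
The paper's actual coverage experiments are only summarized above. As a hedged, generic sketch (a plain conjugate Gaussian regression, not the authors' BGI posterior or their experimental design; all names and settings are assumptions), the code below shows how empirical coverage of 95% posterior predictive intervals can be checked against the nominal level as the predictor dimension p increases.

```python
# Hedged illustration, NOT the paper's experiment: empirical coverage of 95%
# posterior predictive intervals versus the nominal level as dimension p grows,
# for a Gaussian linear model whose likelihood matches the data-generating process.
import numpy as np

rng = np.random.default_rng(1)
sigma2, tau2, n_train, n_test = 1.0, 25.0, 300, 2000

for p in (2, 10, 50):
    w_true = rng.normal(0.0, 1.0, p)
    X = rng.normal(size=(n_train, p))
    y = X @ w_true + rng.normal(0.0, np.sqrt(sigma2), n_train)

    # Conjugate posterior for w under w ~ N(0, tau2 I), y | w ~ N(Xw, sigma2 I).
    cov_w = np.linalg.inv(X.T @ X / sigma2 + np.eye(p) / tau2)
    mean_w = cov_w @ X.T @ y / sigma2

    # Empirical coverage of the 95% posterior predictive interval on fresh data.
    X_new = rng.normal(size=(n_test, p))
    y_new = X_new @ w_true + rng.normal(0.0, np.sqrt(sigma2), n_test)
    mu = X_new @ mean_w
    var = np.einsum("ij,jk,ik->i", X_new, cov_w, X_new) + sigma2
    hit = np.abs(y_new - mu) <= 1.96 * np.sqrt(var)
    print(f"p={p:3d}  empirical coverage (nominal 0.95): {hit.mean():.3f}")
```

In this well-specified toy setting the printed coverage should stay close to 0.95 for every p; the paper's contribution is obtaining comparable calibration for external, confounded domains rather than for in-distribution data as here.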
Carlos García Meixide
ICMAT - Universidad Autónoma de Madrid
Statistics, Machine Learning
David Ríos Insua
Instituto de Ciencias Matemáticas, CSIC