Partial Identification under High-Dimensional Potential Outcomes and Confounders via Optimal Transport

📅 2026-05-30

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This study addresses causal partial identification when both the potential outcomes and the confounders are high-dimensional, a setting where existing optimal transport-based methods suffer from the curse of dimensionality and lose either computational efficiency or statistical accuracy. The authors propose an estimator that decomposes the optimal transport problem into a low-dimensional signal subspace and a high-dimensional residual subspace. Rather than discarding the residual, as projection-only methods do, the estimator recovers the residual transport cost with the sliced Wasserstein distance, which is cheap to compute in high dimensions. The framework provides interpretable conditions for controlling the approximation error and a data-driven criterion for selecting the signal subspace dimension. In the reported experiments, the method consistently outperforms projection-only baselines, yielding more informative bounds for causal identification while remaining computationally tractable.
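
For reference, the sliced Wasserstein distance has a standard textbook definition, given below. The notation is an assumption made here and is not taken from the paper: \mu and \nu are probability measures on R^d, \sigma is the uniform measure on the unit sphere, and \theta_\# denotes the pushforward under the projection x \mapsto \langle \theta, x \rangle.

```latex
\mathrm{SW}_p^p(\mu, \nu)
  = \int_{\mathbb{S}^{d-1}} W_p^p\!\left(\theta_\# \mu,\; \theta_\# \nu\right) \, d\sigma(\theta),
\qquad
W_p^p(\alpha, \beta)
  = \int_0^1 \left| F_\alpha^{-1}(t) - F_\beta^{-1}(t) \right|^p \, dt
  \quad \text{for } \alpha, \beta \text{ on } \mathbb{R}.
```

Each one-dimensional term has the closed quantile form on the right, so a Monte Carlo estimate only needs random directions and a sort per direction.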

📝 Abstract

Partial identification provides informative causal guarantees when point identification is impossible, but existing approaches based on optimal transport (OT) become computationally and statistically intractable in high-dimensional settings. This limitation is particularly severe when both potential outcomes and confounders are high-dimensional, where classical OT-based bounds suffer from the curse of dimensionality and unfavorable convergence rates. To address this challenge, we propose a novel estimator that decomposes the transport problem into a low-dimensional signal subspace and a high-dimensional residual subspace. Unlike existing projection-based methods that discard residual information, we recover the residual transport energy using the Sliced Wasserstein distance, which is computationally efficient and robust to high dimensions. We establish interpretable conditions controlling the approximation gap based on residual structure and provide a data-driven rule for signal dimension selection. Empirical results show that our estimator consistently outperforms projection-only baselines by recovering lost transport energy, yielding more informative causal bounds while remaining computationally tractable in high dimensions.
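
The signal/residual split described above can be illustrated with a minimal sketch. This is not the authors' estimator: the choice of signal subspace (top-k principal directions of the pooled sample), the function names, and the decision to report the two terms separately are all assumptions made for illustration. The listing does not say how the paper selects the subspace or combines the terms.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment


def sliced_w2_squared(x, y, n_projections=200, rng=None):
    """Monte Carlo estimate of the squared sliced 2-Wasserstein distance
    between two equal-size samples x, y of shape (n, d)."""
    rng = np.random.default_rng(rng)
    d = x.shape[1]
    # Random unit directions on the sphere.
    theta = rng.normal(size=(n_projections, d))
    theta /= np.linalg.norm(theta, axis=1, keepdims=True)
    # In 1-D, optimal transport matches order statistics, so sorting suffices.
    px = np.sort(x @ theta.T, axis=0)
    py = np.sort(y @ theta.T, axis=0)
    return float(np.mean((px - py) ** 2))


def split_transport_cost(x, y, k, n_projections=200, seed=0):
    """Hypothetical helper: exact squared-W2 cost in a k-dim signal subspace
    plus a sliced-W2 estimate on the residual. Returns (signal, residual)."""
    pooled = np.vstack([x, y])
    mean = pooled.mean(axis=0)
    # ASSUMPTION: signal subspace = top-k principal directions of pooled data.
    _, _, vt = np.linalg.svd(pooled - mean, full_matrices=False)
    basis = vt[:k].T                      # shape (d, k)
    xs, ys = (x - mean) @ basis, (y - mean) @ basis
    # Residuals: what is left after removing the signal component.
    xr = (x - mean) - xs @ basis.T
    yr = (y - mean) - ys @ basis.T
    # Exact OT between equal-size empirical measures is an assignment problem.
    cost = ((xs[:, None, :] - ys[None, :, :]) ** 2).sum(axis=2)
    rows, cols = linear_sum_assignment(cost)
    signal = float(cost[rows, cols].mean())
    residual = sliced_w2_squared(xr, yr, n_projections, seed)
    return signal, residual


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.normal(size=(100, 50))
    y = rng.normal(size=(100, 50)) + 0.5
    print(split_transport_cost(x, y, k=3))
```

A projection-only baseline would correspond to keeping `signal` and ignoring `residual`. The sliced term costs one sort per direction, which is why it stays cheap as the residual dimension grows.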

Problem

Research questions and friction points this paper is trying to address.

Partial Identification

High-Dimensional

Optimal Transport

Potential Outcomes

Confounders

Innovation

Methods, ideas, or system contributions that make the work stand out.

Optimal Transport

Partial Identification

High-Dimensional Causal Inference