dcFCI: Robust Causal Discovery Under Latent Confounding, Unfaithfulness, and Mixed Data

📅 2025-05-10
🤖 AI Summary
Problem: Causal discovery from observational data is fragile under latent confounding, empirical unfaithfulness, and mixed (continuous and discrete) variable types. Method: We propose dcFCI (data-compatible FCI), a hybrid causal discovery algorithm that addresses all three challenges jointly. It rests on two components: (i) a nonparametric score, the first of its kind, that measures how compatible a PAG is with the observed data and is necessary and sufficient to characterize structural uncertainty and distinguish between distinct PAGs; and (ii) an (Anytime)FCI-guided search that explores, ranks, and validates candidate PAGs. Results: On synthetic and real-world scenarios, dcFCI significantly outperforms state-of-the-art methods and often recovers the true PAG, even from small and heterogeneous datasets. The top-ranked PAGs also expose structural uncertainty, supporting more robust causal reasoning and decision-making.

📝 Abstract
Causal discovery is central to inferring causal relationships from observational data. In the presence of latent confounding, algorithms such as Fast Causal Inference (FCI) learn a Partial Ancestral Graph (PAG) representing the true model's Markov Equivalence Class. However, their correctness critically depends on empirical faithfulness, the assumption that observed (in)dependencies perfectly reflect those of the underlying causal model, which often fails in practice due to limited sample sizes. To address this, we introduce the first nonparametric score to assess a PAG's compatibility with observed data, even with mixed variable types. This score is both necessary and sufficient to characterize structural uncertainty and distinguish between distinct PAGs. We then propose data-compatible FCI (dcFCI), the first hybrid causal discovery algorithm to jointly address latent confounding, empirical unfaithfulness, and mixed data types. dcFCI integrates our score into an (Anytime)FCI-guided search that systematically explores, ranks, and validates candidate PAGs. Experiments on synthetic and real-world scenarios demonstrate that dcFCI significantly outperforms state-of-the-art methods, often recovering the true PAG even in small and heterogeneous datasets. Examining top-ranked PAGs further provides valuable insights into structural uncertainty, supporting more robust and informed causal reasoning and decision-making.
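The abstract's point about empirical unfaithfulness can be illustrated with a small simulation. The sketch below is not taken from the paper: it swaps in a parametric Fisher z-test (a Gaussian assumption, unlike the nonparametric tests the paper relies on), and the function names, effect size, and sample sizes are all illustrative assumptions. It shows how a true but weak dependence is often missed at small sample sizes, so the observed independencies no longer match those of the underlying model.

```python
import math
import numpy as np

def fisher_z_pvalue(data, i, j, cond=()):
    """Two-sided p-value for H0: column i is independent of column j given cond.

    Uses the Fisher z-transform of the (partial) correlation, which assumes
    jointly Gaussian data. This is a stand-in test, not the paper's.
    """
    cols = [i, j, *cond]
    corr = np.corrcoef(data[:, cols], rowvar=False)
    prec = np.linalg.inv(corr)
    # Partial correlation of the first two variables given the rest.
    r = -prec[0, 1] / math.sqrt(prec[0, 0] * prec[1, 1])
    r = max(min(r, 0.999999), -0.999999)  # keep atanh finite
    n = data.shape[0]
    z = math.atanh(r) * math.sqrt(n - len(cond) - 3)
    return math.erfc(abs(z) / math.sqrt(2))

def detection_rate(n, trials=200, effect=0.2, alpha=0.05, seed=0):
    """Fraction of simulated datasets in which the true edge X -> Y is detected."""
    rng = np.random.default_rng(seed)
    hits = 0
    for _ in range(trials):
        x = rng.normal(size=n)
        y = effect * x + rng.normal(size=n)  # X genuinely influences Y
        data = np.column_stack([x, y])
        if fisher_z_pvalue(data, 0, 1) < alpha:
            hits += 1
    return hits / trials

rate_small = detection_rate(n=30)
rate_large = detection_rate(n=2000)
print(f"dependence detected at n=30:   {rate_small:.2f}")
print(f"dependence detected at n=2000: {rate_large:.2f}")
```

In runs where the test fails to reject, a purely constraint-based procedure would treat X and Y as independent and drop the edge, which is the kind of error a data-compatibility score is meant to expose.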
Problem

Research questions and friction points this paper is trying to address.

Robust causal discovery under latent confounding and unfaithfulness
Handling mixed data types in causal inference
Assessing PAG compatibility with observed data nonparametrically
Innovation

Methods, ideas, or system contributions that make the work stand out.

Nonparametric score for PAG compatibility with mixed data
Hybrid algorithm dcFCI addressing confounding and unfaithfulness
AnytimeFCI-guided search ranking and validating candidate PAGs
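
The idea of ranking candidate structures by their compatibility with the data can be sketched as a toy example. This is not the score defined in the paper, and it operates on hand-written constraint sets rather than real PAGs: the p-values are hypothetical inputs chosen for illustration, and the heuristic (log p for each implied independence, log(1 - p) for each implied dependence) is an assumption made only to show what a top-k ranking looks like.

```python
import math

# Hypothetical p-values from conditional independence tests on three
# variables X, Y, Z. Keys are (var_a, var_b, conditioning_set).
p_values = {
    ("X", "Y", ()): 0.001,
    ("X", "Y", ("Z",)): 0.62,
    ("X", "Z", ()): 0.0004,
    ("Y", "Z", ()): 0.002,
}

# Each candidate lists the independencies and dependencies it implies.
candidates = {
    "chain X - Z - Y": {
        "indep": [("X", "Y", ("Z",))],
        "dep": [("X", "Y", ()), ("X", "Z", ()), ("Y", "Z", ())],
    },
    "collider X -> Z <- Y": {
        "indep": [("X", "Y", ())],
        "dep": [("X", "Y", ("Z",)), ("X", "Z", ()), ("Y", "Z", ())],
    },
    "empty graph": {
        "indep": [("X", "Y", ()), ("X", "Z", ()), ("Y", "Z", ()), ("X", "Y", ("Z",))],
        "dep": [],
    },
}

def toy_compatibility(candidate, p_values):
    """Toy log-score: high p supports implied independencies, low p supports dependencies."""
    score = 0.0
    for stmt in candidate["indep"]:
        score += math.log(p_values[stmt])
    for stmt in candidate["dep"]:
        score += math.log(1.0 - p_values[stmt])
    return score

def rank_candidates(candidates, p_values, k=2):
    """Return the k best-scoring candidates, best first."""
    scored = [(name, toy_compatibility(c, p_values)) for name, c in candidates.items()]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:k]

top_k = rank_candidates(candidates, p_values, k=2)
for name, score in top_k:
    print(f"{name}: {score:.3f}")
```

With these inputs the chain ranks first and the collider second. The gap between their scores gives a rough sense of how strongly the data prefer one structure over the other, which is the role the top-ranked PAG list plays in the summary above.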
Adele H. Ribeiro
University of Münster, Institute of Medical Informatics, Münster, Germany
Dominik Heider
Director, University of Münster
Data Science, Machine Learning, Artificial Intelligence, Biomedical Informatics, SaMD