🤖 AI Summary
This paper studies constraint-based causal discovery with tiered background knowledge in settings that relax causal sufficiency, where latent variables arise because relevant information was not measured at all or was not measured jointly, as with multiple overlapping datasets. It first presents new insights into the properties of the tiered FCI (tFCI) algorithm and then introduces tiered IOD (tIOD), an extension of the IOD (integrating overlapping datasets) algorithm that incorporates tiered background knowledge. The paper shows that tFCI and tIOD are sound when the tiered background knowledge is used in full, and that simple versions of both algorithms are sound and complete. It further gives a formal result on the conditions under which tIOD can be expected to be more efficient and informative than IOD, beyond the obvious restriction of the Markov equivalence classes. A series of examples illustrates the role and usefulness of tiered background knowledge.
📝 Abstract
In this paper we consider the use of tiered background knowledge within constraint-based causal discovery. Our focus is on settings relaxing causal sufficiency, i.e., allowing for latent variables, which may arise because relevant information could not be measured at all, or not jointly, as in the case of multiple overlapping datasets. We first present novel insights into the properties of the 'tiered FCI' (tFCI) algorithm. Building on this, we introduce a new extension of the IOD (integrating overlapping datasets) algorithm incorporating tiered background knowledge, the 'tiered IOD' (tIOD) algorithm. We show that, under full usage of the tiered background knowledge, tFCI and tIOD are sound, while simple versions of tIOD and tFCI are sound and complete. We further show that the tIOD algorithm can often be expected to be considerably more efficient and informative than the IOD algorithm, even beyond the obvious restriction of the Markov equivalence classes. We provide a formal result on the conditions for this gain in efficiency and informativeness. Our results are accompanied by a series of examples illustrating the exact role and usefulness of tiered background knowledge.
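To make the notion of tiered background knowledge concrete, the following minimal sketch shows the usual reading of a tier ordering: a variable in a later tier cannot be a cause of a variable in an earlier tier, so every edge between tiers receives an arrowhead at its later-tier endpoint. This is an illustration under stated assumptions, not the tFCI or tIOD procedure itself. The function `orient_by_tiers`, the mark labels, and the example variables are hypothetical names chosen here.

```python
from typing import Dict, Tuple

# Hypothetical representation: each edge maps to the marks at its two endpoints.
# Marks: "circle" (undetermined), "arrow" (arrowhead), "tail".
Edge = Tuple[str, str]    # (node_a, node_b)
Marks = Tuple[str, str]   # (mark at node_a, mark at node_b)


def orient_by_tiers(edges: Dict[Edge, Marks], tier: Dict[str, int]) -> Dict[Edge, Marks]:
    """Place an arrowhead at the later-tier endpoint of every cross-tier edge.

    Edges within a tier are left unchanged, because the tier ordering
    says nothing about them.
    """
    oriented: Dict[Edge, Marks] = {}
    for (a, b), (mark_a, mark_b) in edges.items():
        if tier[a] < tier[b]:
            mark_b = "arrow"   # b is later, so b cannot be an ancestor of a
        elif tier[b] < tier[a]:
            mark_a = "arrow"   # a is later, so a cannot be an ancestor of b
        oriented[(a, b)] = (mark_a, mark_b)
    return oriented


# Assumed toy input: X and Y are measured in tier 1, Z in tier 2.
tier = {"X": 1, "Y": 1, "Z": 2}
skeleton = {
    ("X", "Y"): ("circle", "circle"),
    ("X", "Z"): ("circle", "circle"),
    ("Z", "Y"): ("circle", "circle"),
}
result = orient_by_tiers(skeleton, tier)
```

In this toy input, the two edges touching Z gain an arrowhead at Z, while the within-tier edge between X and Y keeps both circle marks. Only the remaining circle marks would then need to be resolved by further orientation rules.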