🤖 AI Summary
Existing hard- and soft-constraint methods for causal structure learning lack robustness against erroneous prior knowledge and require manual intervention. Method: We propose an approach resilient to edge-level prior errors. It first formally defines "quasi-circles" and proves that they are the necessary and sufficient condition for significant degradation in Structural Hamming Distance (SHD) induced by prior errors; based on this, we establish a theoretical link between error categories and their SHD impact, enabling confidence-free, post-hoc automatic correction. Our method integrates causal discovery algorithms, DAG constraint verification, SHD sensitivity analysis, and a graph-theoretic error localization mechanism. Results: Extensive experiments on synthetic and real-world datasets demonstrate substantially improved robustness, particularly against order-reversal errors, while preserving as much valid prior information as possible.
📝 Abstract
Causal structure learning (CSL) is a prominent technique for encoding cause-and-effect relationships among variables through Bayesian Networks (BNs). Although recovering causal structure from data alone is challenging, integrating prior knowledge that reveals part of the true structure can markedly enhance learning quality. However, current methods based on prior knowledge exhibit limited resilience to errors in the prior: hard-constraint methods disregard priors entirely, and soft-constraint methods accept priors according to a predetermined confidence level, which may require expert intervention. To address this issue, we propose a CSL strategy that is resilient to edge-level prior errors, thereby minimizing human intervention. We classify prior errors into different types and derive their theoretical impact on the Structural Hamming Distance (SHD) under the assumption of sufficient data. Intriguingly, we discover and prove that the strong hazard of prior errors is associated with a unique acyclic closed structure, which we define as a "quasi-circle". Leveraging this insight, we employ a post-hoc strategy that identifies prior errors by their impact on the increase in "quasi-circles". Through empirical evaluation on both real and synthetic datasets, we demonstrate our strategy's robustness against prior errors. In particular, it shows a substantial ability to resist order-reversed errors while retaining most of the correct prior knowledge.
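For readers unfamiliar with the metric, the following is a minimal sketch of how SHD between a true and a learned directed graph might be computed. It is not the authors' implementation. It assumes one common convention in which each missing, extra, or reversed edge counts as one unit (some definitions count a reversal as two), and the function name `shd` and the example edges are hypothetical.

```python
def shd(true_edges, learned_edges):
    """Structural Hamming Distance between two DAGs.

    Each graph is an iterable of (parent, child) pairs. Assumed convention:
    a missing, extra, or reversed edge each adds 1 to the distance.
    """
    true_edges, learned_edges = set(true_edges), set(learned_edges)

    # Undirected skeletons: an edge is identified by its unordered endpoints.
    true_skel = {frozenset(e) for e in true_edges}
    learned_skel = {frozenset(e) for e in learned_edges}

    # Edges present in exactly one skeleton are missing or extra.
    missing_or_extra = len(true_skel ^ learned_skel)

    # Edges present in both skeletons but oriented differently are reversed.
    shared = true_skel & learned_skel
    reversed_count = sum(
        1
        for (u, v) in learned_edges
        if frozenset((u, v)) in shared and (u, v) not in true_edges
    )
    return missing_or_extra + reversed_count


# Hypothetical example: truth is A -> B -> C.
truth = {("A", "B"), ("B", "C")}
# The learned graph reverses A -> B and adds a spurious A -> C.
learned = {("B", "A"), ("B", "C"), ("A", "C")}
print(shd(truth, learned))
```

Under this convention, an order-reversed prior that is enforced as-is contributes at least one unit to SHD through the reversed edge itself; any further structural damage it induces in the rest of the learned graph is what the paper's analysis addresses.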