🤖 AI Summary
Causal discovery algorithms often rely on strong, empirically unverifiable assumptions, leading to poor robustness on real-world data. This paper systematically evaluates differentiable causal discovery methods under eight canonical model misspecification scenarios, including nonlinearity, heteroscedasticity, and confounding, using Structural Hamming Distance (SHD) and Structural Intervention Distance (SID) for quantitative assessment. Results show that differentiable methods significantly outperform conventional approaches across all misspecification settings except scale transformations, demonstrating superior robustness. The paper further provides a theoretical analysis elucidating the intrinsic tolerance of these methods to misspecification of functional form and noise structure. This work establishes a more realistic, comprehensive evaluation paradigm for causal discovery algorithms and contributes both theoretical insights and empirical evidence toward robust causal learning.
📝 Abstract
Causal discovery aims to learn causal relationships between variables from observed data, making it a fundamental task in machine learning. However, causal discovery algorithms often rely on unverifiable causal assumptions that are usually difficult to satisfy in real-world data, which limits the broad application of causal discovery in practical scenarios. Motivated by these considerations, this work extensively benchmarks the empirical performance of mainstream causal discovery algorithms, which assume i.i.d. data, under eight model assumption violations. Our experimental results show that differentiable causal discovery methods remain robust, as measured by the Structural Hamming Distance and Structural Intervention Distance of the inferred graphs, across commonly used challenging scenarios, with the exception of scale variation. We also provide theoretical explanations for the performance of differentiable causal discovery methods. Finally, our work aims to comprehensively benchmark recent differentiable causal discovery methods under model assumption violations, to provide a standard for the reasonable evaluation of causal discovery, and to further promote its application in real-world scenarios.
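Both evaluation metrics compare an inferred graph against the ground-truth DAG. As a minimal sketch (not the paper's actual evaluation code), the Structural Hamming Distance can be computed from binary adjacency matrices as the number of edge additions, deletions, and reversals needed to turn the estimated graph into the true one; the function name and the `A[i, j] = 1` means `i -> j` convention below are illustrative assumptions:

```python
import numpy as np

def structural_hamming_distance(true_adj, est_adj):
    """SHD between two DAGs given as 0/1 adjacency matrices.

    Counts missing edges, extra edges, and reversed edges,
    with a reversed edge counted as a single error.
    """
    true_adj = np.asarray(true_adj)
    est_adj = np.asarray(est_adj)
    diff = np.abs(true_adj - est_adj)
    # A reversed edge appears as two mismatches (i->j missing, j->i extra);
    # symmetrize and clip so it contributes only once.
    diff = diff + diff.T
    diff[diff > 1] = 1
    # Each remaining error now appears symmetrically; count the upper triangle.
    return int(np.triu(diff).sum())

# Example: true graph 0->1->2; estimate reverses 0->1 and adds 0->2.
true_g = [[0, 1, 0], [0, 0, 1], [0, 0, 0]]
est_g = [[0, 0, 1], [1, 0, 1], [0, 0, 0]]
print(structural_hamming_distance(true_g, est_g))  # → 2 (one reversal, one extra edge)
```

SID, by contrast, counts pairs of nodes for which the estimated graph yields an incorrect intervention distribution, and is substantially more involved to implement; off-the-shelf implementations exist in causal discovery toolkits.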