🤖 AI Summary
This paper addresses the identifiability of causal graphs under linear acyclic structural equation models (SEMs), focusing on the key assumption that the error terms are independent and **homoscedastic** (equal variances). It proposes a causal discovery method built on the **total prediction squared error (TPE)**: the sum, over all variables, of the minimum expected squared error when each variable is predicted by the best linear combination of its parents in a candidate directed acyclic graph (DAG). The main theoretical result is that TPE attains its minimum if and only if the candidate DAG is a supergraph of the true DAG, so the true DAG and all of its supergraphs share the same minimal value, while every other candidate has a strictly larger TPE. Leveraging this property, the authors design a Bayesian DAG selection method that is provably consistent. The approach combines least-squares linear regression with a search over candidate DAGs and, because it relies on equal error variances, does not require the errors to be non-Gaussian. Under homoscedasticity, the method recovers the ground-truth causal structure consistently, strengthening the theoretical guarantees available for causal discovery in linear SEMs.
📝 Abstract
We consider the problem of recovering the true causal structure among a set of variables generated by a linear acyclic structural equation model (SEM) whose error terms are independent and have equal variances. It is well known that the true underlying directed acyclic graph (DAG) encoding the causal structure is uniquely identifiable under this assumption. In this work, we establish that the sum of minimum expected squared errors over all variables, where each variable is predicted by the best linear combination of its parent variables, is minimised if and only if the causal structure is represented by a supergraph of the true DAG. This property is further utilised to design a Bayesian DAG selection method that recovers the true graph consistently.
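The supergraph property can be checked numerically on a toy model. The sketch below is an illustration only and is not the authors' Bayesian selection method: it assumes a three-variable chain X0 → X1 → X2 with arbitrarily chosen coefficients 0.8 and 0.5 and unit error variances, and computes the population-level sum of best-linear-predictor errors directly from the implied covariance matrix. The function names `population_covariance` and `total_prediction_error` are hypothetical, introduced here for the example.

```python
import numpy as np


def population_covariance(B, sigma2=1.0):
    """Covariance of X when X = B X + e and Cov(e) = sigma2 * I.

    B[j, i] is the coefficient of X_i in the equation for X_j.
    """
    d = B.shape[0]
    A = np.linalg.inv(np.eye(d) - B)  # X = (I - B)^{-1} e
    return sigma2 * A @ A.T


def total_prediction_error(Sigma, parents):
    """Sum over variables of the minimum expected squared error when each
    variable is predicted by the best linear combination of its parents."""
    total = 0.0
    for j, pa in parents.items():
        pa = list(pa)
        resid = Sigma[j, j]
        if pa:
            S_pp = Sigma[np.ix_(pa, pa)]
            s_pj = Sigma[pa, j]
            # Residual variance of the best linear predictor of X_j from X_pa.
            resid -= s_pj @ np.linalg.solve(S_pp, s_pj)
        total += resid
    return total


# Assumed toy model: X0 -> X1 -> X2, equal error variances of 1.
B = np.zeros((3, 3))
B[1, 0] = 0.8
B[2, 1] = 0.5
Sigma = population_covariance(B, sigma2=1.0)

candidates = {
    "true chain 0->1->2": {0: [], 1: [0], 2: [1]},
    "supergraph (adds 0->2)": {0: [], 1: [0], 2: [0, 1]},
    "reversed chain 2->1->0": {2: [], 1: [2], 0: [1]},
    "empty graph": {0: [], 1: [], 2: []},
}
tpe = {name: total_prediction_error(Sigma, pa) for name, pa in candidates.items()}

for name, value in tpe.items():
    print(f"{name}: {value:.3f}")
```

In this toy setting the true chain and its supergraph both reach three times the error variance, while the reversed chain and the empty graph give strictly larger totals. Since the error sum alone cannot separate the true DAG from its supergraphs, a selection method needs some further mechanism to prefer the sparser graph.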