DAGAF: A directed acyclic generative adversarial framework for joint structure learning and tabular data synthesis

📅 2026-04-05
🤖 AI Summary
Existing causal learning methods usually rely on a single identifiable causal model, which makes it hard to achieve both accurate causal structure discovery and high-quality tabular data generation. The authors propose DAGAF, a two-stage generative adversarial framework that supports three functional causal models: the additive noise model (ANM), the linear non-Gaussian acyclic model (LiNGAM), and the post-nonlinear (PNL) model. The method represents causal relationships among variables as a directed acyclic graph, learns that graph implicitly, and uses a multi-term loss function to optimize structure learning and data synthesis together. A theoretical analysis explains the individual loss terms. In experiments, DAGAF outperforms many existing methods on Sachs, Child, Hailfinder, and Pathfinder, improving structural Hamming distance (SHD) over the state of the art by 47%, 11%, 5%, and 7% respectively, while generating diverse, high-quality samples.
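For orientation, the three functional causal models named above are commonly written as follows. This is a sketch in standard textbook notation, not necessarily the notation used in the paper: the symbols $X_j$ (a variable), $\mathrm{PA}_j$ (its parents in the DAG), $\varepsilon_j$ (a noise term independent of the parents), $f_j$, $g_j$ and $b_{jk}$ are assumptions introduced here for illustration.

```latex
\begin{align*}
\text{ANM:}    \quad & X_j = f_j\bigl(\mathrm{PA}_j\bigr) + \varepsilon_j \\
\text{LiNGAM:} \quad & X_j = \sum_{k \in \mathrm{PA}_j} b_{jk} X_k + \varepsilon_j,
                       \quad \varepsilon_j \text{ non-Gaussian} \\
\text{PNL:}    \quad & X_j = g_j\bigl(f_j(\mathrm{PA}_j) + \varepsilon_j\bigr),
                       \quad g_j \text{ invertible}
\end{align*}
```

Read this way, LiNGAM is the linear special case of the ANM with non-Gaussian noise, and the PNL model extends the ANM by applying an invertible distortion $g_j$ after the noise is added.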
📝 Abstract
Understanding the causal relationships between data variables can provide crucial insights into the construction of tabular datasets. Most existing causal learning methods apply a single identifiable causal model, such as the Additive Noise Model (ANM) or the Linear non-Gaussian Acyclic Model (LiNGAM), to discover the dependencies exhibited in observational data. We improve on this approach by introducing a novel dual-step framework capable of performing both causal structure learning and tabular data synthesis under multiple causal model assumptions. Our approach uses Directed Acyclic Graphs (DAGs) to represent causal relationships among data variables. By applying various functional causal models, including ANM, LiNGAM and the Post-Nonlinear model (PNL), we implicitly learn the contents of the DAG to simulate the generative process of observational data, effectively replicating the real data distribution. This is supported by a theoretical analysis that explains the multiple loss terms comprising the objective function of the framework. Experimental results demonstrate that DAGAF outperforms many existing methods in structure learning, achieving significantly lower Structural Hamming Distance (SHD) scores across both real-world and benchmark datasets (improvements over the state of the art of 47% on Sachs, 11% on Child, 5% on Hailfinder, and 7% on Pathfinder), while being able to produce diverse, high-quality samples.
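The abstract reports results in terms of Structural Hamming Distance (SHD). As a rough illustration of what that metric measures, the sketch below counts the variable pairs on which two directed graphs disagree. It assumes one common convention, in which a missing edge, an extra edge, and a reversed edge each cost 1; the paper may use a different variant, and the function name and the adjacency-matrix encoding (`adj[i][j] != 0` means an edge from `i` to `j`) are hypothetical choices made here.

```python
import numpy as np


def structural_hamming_distance(true_adj, est_adj):
    """Count node pairs whose edge status differs between two directed graphs.

    Assumed convention (not taken from the paper): a missing, extra,
    or reversed edge each contributes 1 to the distance.
    """
    true_bin = (np.asarray(true_adj) != 0).astype(int)
    est_bin = (np.asarray(est_adj) != 0).astype(int)
    if true_bin.shape != est_bin.shape or true_bin.shape[0] != true_bin.shape[1]:
        raise ValueError("adjacency matrices must be square and the same shape")

    # Entry-wise disagreement between the two graphs.
    diff = np.abs(true_bin - est_bin)
    # A reversed edge shows up as two disagreeing entries, (i, j) and (j, i).
    # Adding the transpose and keeping only the strict upper triangle makes
    # each unordered pair {i, j} count at most once.
    per_pair = np.triu(diff + diff.T, k=1)
    return int(np.count_nonzero(per_pair))


# Toy example: the true graph is 0 -> 1 -> 2.
true_graph = [[0, 1, 0],
              [0, 0, 1],
              [0, 0, 0]]
# The estimate reverses 0 -> 1 and adds a spurious edge 0 -> 2.
estimated_graph = [[0, 0, 1],
                   [1, 0, 1],
                   [0, 0, 0]]

print(structural_hamming_distance(true_graph, estimated_graph))  # prints 2
```

Lower values mean the estimated graph is closer to the reference graph, and an SHD of 0 means the two graphs have identical edges.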
Problem

Research questions and friction points this paper is trying to address.

causal structure learning
tabular data synthesis
directed acyclic graph
generative adversarial framework
functional causal models
Innovation

Methods, ideas, or system contributions that make the work stand out.

causal structure learning
tabular data synthesis
Directed Acyclic Graph (DAG)
generative adversarial framework
functional causal models
Hristo Petkov
Department of Computer and Information Sciences, University of Strathclyde, 16 Richmond Street, Glasgow, G1 1XQ, Lanarkshire, United Kingdom
Calum MacLellan
Department of Computer and Information Sciences, University of Strathclyde, 16 Richmond Street, Glasgow, G1 1XQ, Lanarkshire, United Kingdom
Feng Dong
Tsinghua University, School of Economics and Management
Macroeconomics, Monetary Economics, Financial Economics, Chinese Economy