Semi-Supervised Learning under General Causal Models

📅 2025-10-26
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
This work addresses the fundamental question of why unlabeled data improve generalization in semi-supervised learning (SSL), focusing on the challenge of modeling complex causal relationships between features and labels in realistic settings. We propose the first SSL framework applicable to general causal graph structures: leveraging a causal generative model, it infers latent causal mechanisms from unlabeled data and synthesizes high-fidelity pseudo-labels—thereby enhancing discriminative model performance without relying on strong distributional or structural assumptions. Our approach integrates causal graph identifiability, counterfactual reasoning, and consistency regularization, supporting diverse causal structures including confounding, mediation, and backdoor paths. Extensive experiments on synthetic data and multiple real-world benchmarks—including medical and image domains—demonstrate consistent superiority over state-of-the-art SSL and causal learning methods, achieving average accuracy gains of 3.2–7.8 percentage points.

Technology Category

Application Category

📝 Abstract
Semi-supervised learning (SSL) aims to train a machine learning model using both labelled and unlabelled data. While the unlabelled data have been used in various ways to improve the prediction accuracy, the reason why unlabelled data could help is not fully understood. One interesting and promising direction is to understand SSL from a causal perspective. In light of the independent causal mechanisms principle, the unlabelled data can be helpful when the label causes the features but not vice versa. However, the causal relations between the features and labels can be complex in real world applications. In this paper, we propose a SSL framework that works with general causal models in which the variables have flexible causal relations. More specifically, we explore the causal graph structures and design corresponding causal generative models which can be learned with the help of unlabelled data. The learned causal generative model can generate synthetic labelled data for training a more accurate predictive model. We verify the effectiveness of our proposed method by empirical studies on both simulated and real data.
Problem

Research questions and friction points this paper is trying to address.

Developing SSL framework for flexible causal feature-label relations
Exploiting unlabeled data to learn causal generative models
Generating synthetic labeled data to improve prediction accuracy
Innovation

Methods, ideas, or system contributions that make the work stand out.

Develops SSL framework for general causal models
Designs causal generative models using unlabelled data
Generates synthetic labelled data to improve accuracy
🔎 Similar Papers
No similar papers found.
A
Archer Moore
School of Mathematics and Statistics, Faculty of Science, The University of Melbourne, Melbourne, Australia
H
Heejung Shim
School of Mathematics and Statistics, Faculty of Science, The University of Melbourne, Melbourne, Australia
Mingming Gong
Mingming Gong
University of Melbourne & Mohamed bin Zayed University of Artificial Intelligence
Causal InferenceMachine LearningComputer Vision
Jingge Zhu
Jingge Zhu
University of Melbourne
Information TheoryCommunication SystemsStatistical Learning Theory