A Reverse Causal Framework to Mitigate Spurious Correlations for Debiasing Scene Graph Generation.

📅 2025-05-09
🏛️ IEEE Transactions on Pattern Analysis and Machine Intelligence
📈 Citations: 0
Influential: 0
🤖 AI Summary
Existing two-stage scene graph generation (SGG) frameworks train a detector and a classifier in a causal chain, which can induce spurious correlations between the detector's inputs and the final predictions. The paper links this to at least two biases: (i) tail relationships are predicted as head relationships, and (ii) foreground relationships are predicted as background, a bias that is seldom discussed in the literature. To address this, the paper proposes the Reverse causal Framework for SGG (RcSGG), which restructures the chain into a reverse causal structure in which the classifier's inputs are treated as the confounder. RcSGG uses Active Reverse Estimation (ARE) to intervene on the confounder and estimate the reverse causality (from final predictions to the classifier's inputs), and Maximum Information Sampling (MIS) to further improve that estimate using relationship information. Experiments on popular benchmarks and diverse SGG frameworks report state-of-the-art mean recall.

📝 Abstract
Existing two-stage Scene Graph Generation (SGG) frameworks typically incorporate a detector to extract relationship features and a classifier to categorize these relationships; therefore, the training paradigm follows a causal chain structure, where the detector's inputs determine the classifier's inputs, which in turn influence the final predictions. However, such a causal chain structure can yield spurious correlations between the detector's inputs and the final predictions, i.e., the prediction of a certain relationship may be influenced by other relationships. This influence can induce at least two observable biases: tail relationships are predicted as head ones, and foreground relationships are predicted as background ones; notably, the latter bias is seldom discussed in the literature. To address this issue, we propose reconstructing the causal chain structure into a reverse causal structure, wherein the classifier's inputs are treated as the confounder, and both the detector's inputs and the final predictions are viewed as causal variables. Specifically, we term the reconstructed causal paradigm the Reverse causal Framework for SGG (RcSGG). RcSGG first employs the proposed Active Reverse Estimation (ARE) to intervene on the confounder and estimate the reverse causality, i.e., the causality from final predictions to the classifier's inputs. Then, Maximum Information Sampling (MIS) is proposed to further enhance the reverse causality estimation by considering relationship information. Theoretically, RcSGG can mitigate the spurious correlations inherent in the SGG framework, subsequently eliminating the induced biases. Comprehensive experiments on popular benchmarks and diverse SGG frameworks show that RcSGG achieves state-of-the-art mean recall.
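The abstract reports results in mean recall, which averages recall per relationship category instead of over all instances, so a model that predicts tail relationships as head ones is penalized even when its overall recall looks high. The sketch below is a simplified illustration of that difference only: the function name, the toy predicates, and the counts are hypothetical, and it omits the per-image top-K ranking used in standard SGG evaluation.

```python
from collections import defaultdict


def recall_and_mean_recall(samples):
    """Compute overall recall and mean (per-predicate) recall.

    samples: iterable of (gt_predicate, was_recalled) pairs, one per
    ground-truth relationship. This is a simplified stand-in for R@K / mR@K.
    """
    total = defaultdict(int)
    hits = defaultdict(int)
    for predicate, was_recalled in samples:
        total[predicate] += 1
        hits[predicate] += int(was_recalled)

    # Overall recall: every ground-truth instance counts equally,
    # so head predicates dominate the score.
    overall = sum(hits.values()) / sum(total.values())

    # Mean recall: every predicate counts equally,
    # so poorly recalled tail predicates pull the score down.
    per_predicate = {p: hits[p] / total[p] for p in total}
    mean_recall = sum(per_predicate.values()) / len(per_predicate)
    return overall, mean_recall, per_predicate


# Hypothetical toy data: a head predicate ("on") and a tail predicate
# ("parked on") that is mostly missed.
samples = (
    [("on", True)] * 81 + [("on", False)] * 9
    + [("parked on", True)] * 1 + [("parked on", False)] * 9
)

overall, mean_recall, per_predicate = recall_and_mean_recall(samples)
print(f"overall recall: {overall:.2f}")
print(f"mean recall:    {mean_recall:.2f}")
```

On this toy data the overall recall is 0.82 while the mean recall is 0.50, which is why mean recall is the more informative number for the tail/head bias described above.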
Problem

Research questions and friction points this paper is trying to address.

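Spurious correlations between the detector's inputs and the final predictions in two-stage Scene Graph Generation
Two resulting prediction biases: tail relationships predicted as head ones, and foreground relationships predicted as background
How to restructure the causal chain so that these biases can be mitigated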
Innovation

Methods, ideas, or system contributions that make the work stand out.

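Reverse causal structure (RcSGG) that treats the classifier's inputs as the confounder
Active Reverse Estimation (ARE) intervenes on the confounder to estimate the reverse causality, from final predictions to the classifier's inputs
Maximum Information Sampling (MIS) further improves the reverse causality estimate by considering relationship information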
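To make the structural idea concrete, the toy simulation below contrasts a chain, where the first variable drives the last through the middle one, with a structure in which the first and last variables both point into the middle one. Everything here is an assumption for illustration: the names X (detector inputs), Z (classifier inputs), and Y (final predictions), the linear-Gaussian mechanisms, and the reading of the "reverse causal structure" as X → Z ← Y, which is one literal interpretation of the abstract's "causality from final predictions to the classifier's inputs". It is not an implementation of RcSGG, ARE, or MIS.

```python
import random


def pearson(a, b):
    """Sample Pearson correlation of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / (var_a * var_b) ** 0.5


def simulate(n=20000, seed=0):
    """Return corr(X, Y) under a chain and under a converging structure."""
    rng = random.Random(seed)

    # Chain X -> Z -> Y: Y inherits variation from X through Z.
    x_chain = [rng.gauss(0, 1) for _ in range(n)]
    z_chain = [x + rng.gauss(0, 1) for x in x_chain]
    y_chain = [z + rng.gauss(0, 1) for z in z_chain]

    # Converging structure X -> Z <- Y (assumed reading, see lead-in):
    # X and Y are generated independently and both feed into Z.
    x_rev = [rng.gauss(0, 1) for _ in range(n)]
    y_rev = [rng.gauss(0, 1) for _ in range(n)]
    z_rev = [x + y + rng.gauss(0, 1) for x, y in zip(x_rev, y_rev)]  # shown for completeness

    return pearson(x_chain, y_chain), pearson(x_rev, y_rev)


chain_corr, reverse_corr = simulate()
print(f"corr(X, Y) in chain X -> Z -> Y:      {chain_corr:.3f}")
print(f"corr(X, Y) in structure X -> Z <- Y:  {reverse_corr:.3f}")
```

In the chain, X and Y are clearly correlated; in the converging structure they are approximately uncorrelated as long as Z is not conditioned on. How the paper actually estimates and exploits the reverse direction is specified by ARE and MIS, not by this sketch.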
Authors

Shuzhou Sun
University of Oulu
Deep learning, Computer vision, Causal inference

Li Liu
College of Electronic Science and Technology, National University of Defense Technology (NUDT), Changsha, Hunan, China

Tianpeng Liu
College of Electronic Science and Technology, National University of Defense Technology (NUDT), Changsha, Hunan, China

Shuaifeng Zhi
Imperial College London

Ming-Ming Cheng
Professor of Computer Science, Nankai University
Computer Vision, Computer Graphics, Visual Attention, Saliency

Janne Heikkila
Center for Machine Vision and Signal Analysis (CMVS), University of Oulu, 90570 Oulu, Finland

Yongxiang Liu
Professor, National University of Defense Technology
Remote Sensing, Synthetic Aperture Radar, Radar, Image Processing, Pattern Recognition