🤖 AI Summary
Task: Robust node classification on graph data under unknown, uncharacterized label noise, where neither the noise type nor the noise rate is available a priori.
Method: We propose DeGLIF, which builds on recent work computing the leave-one-out (LOO) influence function for graph neural networks (GNNs). Leveraging a small set of clean validation nodes, DeGLIF estimates the influence of each training node on the validation loss and uses that estimate, together with a theoretically motivated relabelling function, to automatically detect and correct noisy labels, without modeling the noise distribution or estimating noise parameters. For one of its two variants, the detected noisy points are provably risk-increasing.
Results: On multiple benchmark graph datasets, DeGLIF consistently outperforms state-of-the-art label-noise mitigation methods, achieving average accuracy gains of 3.2–5.7 percentage points. Empirical results demonstrate strong robustness and generalization under realistic, unstructured label noise.
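The leave-one-out influence estimate at the core of the method above can be sketched in its standard form from the influence-function literature (this is the classical approximation, not quoted from the paper itself). For a model with empirical risk minimizer $\hat\theta$ over $n$ training points, removing a training point $z$ shifts the parameters approximately by

$$\hat\theta_{-z} - \hat\theta \;\approx\; \frac{1}{n}\, H_{\hat\theta}^{-1} \nabla_\theta L(z, \hat\theta), \qquad H_{\hat\theta} = \frac{1}{n}\sum_{i=1}^{n} \nabla_\theta^2 L(z_i, \hat\theta),$$

and a first-order expansion gives the induced change in loss at a clean validation point $z_{\mathrm{val}}$:

$$L(z_{\mathrm{val}}, \hat\theta_{-z}) - L(z_{\mathrm{val}}, \hat\theta) \;\approx\; \frac{1}{n}\, \nabla_\theta L(z_{\mathrm{val}}, \hat\theta)^{\top} H_{\hat\theta}^{-1}\, \nabla_\theta L(z, \hat\theta).$$

Training nodes for which this quantity is negative (i.e., whose removal is predicted to lower the clean validation loss) are natural candidates for relabelling.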
📝 Abstract
Noisy labelled datasets are generally inexpensive compared to clean labelled datasets, and the same is true for graph data. In this paper, we propose a denoising technique, DeGLIF: Denoising Graph data using Leave-one-out Influence Function. DeGLIF uses a small set of clean data and the leave-one-out influence function to make node-level predictions on graph data that are robust to label noise. The leave-one-out influence function approximates the change in the model parameters if a training point is removed from the training dataset. Recent advances propose a way to calculate the leave-one-out influence function for Graph Neural Networks (GNNs). We extend that work to estimate the change in validation loss when a training node is removed from the training dataset. We use this estimate, together with a new theoretically motivated relabelling function, to denoise the training dataset. We propose two DeGLIF variants for identifying noisy nodes. Neither variant requires any information about the noise model or the noise level in the dataset, nor does DeGLIF estimate these quantities. For one of the variants, we prove that the detected noisy points can indeed increase risk. We carry out detailed computational experiments on different datasets to show the effectiveness of DeGLIF: it achieves better accuracy than other baseline algorithms.
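To make the influence-based detection step concrete, here is a minimal, self-contained sketch on a toy problem. It is not the authors' DeGLIF implementation: it replaces the GNN with a tiny logistic-regression model (so the Hessian can be inverted exactly), uses synthetic blob data with a few flipped labels, and stands in a clean-label copy of the training set for the clean validation nodes. The score computed for each training point is the standard first-order estimate of the change in validation loss if that point were removed; points with the most negative scores are flagged as likely noisy.

```python
# Hypothetical sketch of leave-one-out influence scoring for noisy-label
# detection. A tiny logistic regression stands in for the GNN, and a
# clean-label copy of the training set stands in for the clean validation set.
import torch

torch.manual_seed(0)

# Toy data: two Gaussian blobs; flip a few training labels to inject noise.
n = 40
X = torch.cat([torch.randn(n, 2) + 2.0, torch.randn(n, 2) - 2.0])
y = torch.cat([torch.ones(n), torch.zeros(n)])
noisy_idx = [0, 1, 2]
y[noisy_idx] = 1.0 - y[noisy_idx]            # corrupted labels

Xb = torch.cat([X, torch.ones(len(X), 1)], dim=1)  # append bias feature
w = torch.zeros(3, requires_grad=True)

def loss_fn(w, Xb, y):
    return torch.nn.functional.binary_cross_entropy_with_logits(Xb @ w, y)

# Fit on the *noisy* labels by full-batch gradient descent (with a small
# ridge term so the Hessian is invertible).
opt = torch.optim.Adam([w], lr=0.1)
for _ in range(300):
    opt.zero_grad()
    (loss_fn(w, Xb, y) + 1e-3 * w.pow(2).sum()).backward()
    opt.step()

# Exact Hessian of the regularized training loss -- feasible for 3 parameters.
H = torch.autograd.functional.hessian(
    lambda w_: loss_fn(w_, Xb, y) + 1e-3 * w_.pow(2).sum(), w.detach())
H_inv = torch.linalg.inv(H)

# Gradient of the clean "validation" loss at the noisy-trained parameters.
y_clean = y.clone()
y_clean[noisy_idx] = 1.0 - y_clean[noisy_idx]
g_val = torch.autograd.grad(loss_fn(w, Xb, y_clean), w)[0]

# Predicted change in validation loss if training point i is removed:
#   delta_L_val(i) ~= (1/n) * grad L_val(theta)^T H^{-1} grad L(z_i, theta)
scores = []
for i in range(len(Xb)):
    g_i = torch.autograd.grad(loss_fn(w, Xb[i:i + 1], y[i:i + 1]), w)[0]
    scores.append((g_val @ H_inv @ g_i / len(Xb)).item())

# Points whose removal is predicted to *reduce* validation loss
# (most negative scores) are flagged as likely noisy.
flagged = sorted(range(len(scores)), key=lambda i: scores[i])[:3]
print(sorted(flagged))
```

For a real GNN the Hessian cannot be inverted explicitly; influence-function implementations typically replace `H_inv @ g_i` with an approximate inverse-Hessian-vector product (e.g., conjugate gradients or stochastic estimation), but the scoring and flagging logic is the same.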