Are LLMs Better GNN Helpers? Rethinking Robust Graph Learning under Deficiencies with Iterative Refinement

📅 2025-10-02
📈 Citations: 0
✨ Influential: 0
🤖 AI Summary
Real-world graph data often exhibit multiple defects, including noise, missing values, and inconsistencies, that severely degrade the performance of Graph Neural Networks (GNNs). Prior work predominantly addresses individual defects in isolation, lacking a systematic robustness evaluation of both conventional GNNs and emerging LLM-on-graph approaches under composite defects. This paper presents the first empirical comparative study, revealing that LLM augmentation is not universally superior and exhibits notable fragility under structural–textual misalignment. To address this, the authors propose Robust Graph Learning via Retrieval-Augmented Contrastive Refinement (RoGRAD), a framework integrating retrieval-augmented generation (RAG), graph contrastive learning, dynamic feature enhancement, and class-consistency regularization, transforming static feature injection into an iterative retrieve–generate–contrast optimization process. Extensive experiments on multiple text-attributed graph benchmarks demonstrate that RoGRAD achieves up to 82.43% average improvement, significantly outperforming both traditional and LLM-enhanced baselines.

📝 Abstract
Graph Neural Networks (GNNs) are widely adopted in Web-related applications, serving as a core technique for learning from graph-structured data such as text-attributed graphs. Yet in real-world scenarios, such graphs exhibit deficiencies that substantially undermine GNN performance. While prior GNN-based augmentation studies have explored robustness against individual imperfections, a systematic understanding of how graph-native and Large Language Model (LLM)-enhanced methods behave under compound deficiencies is still missing. Specifically, there has been no comprehensive investigation comparing conventional approaches and recent LLM-on-graph frameworks, leaving their relative merits unclear. To fill this gap, we conduct the first empirical study that benchmarks these two lines of methods across diverse graph deficiencies, revealing overlooked vulnerabilities and challenging the assumption that LLM augmentation is consistently superior. Building on these empirical findings, we propose the Robust Graph Learning via Retrieval-Augmented Contrastive Refinement (RoGRAD) framework. Unlike prior one-shot LLM-as-Enhancer designs, RoGRAD is the first iterative paradigm that leverages Retrieval-Augmented Generation (RAG) to supply class-consistent, diverse, retrieval-grounded augmentations and to enforce discriminative representations through iterative graph contrastive learning. It transforms LLM augmentation for graphs from static signal injection into dynamic refinement. Extensive experiments demonstrate RoGRAD's superiority over both conventional GNN- and LLM-enhanced baselines, achieving up to 82.43% average improvement.
Problem

Research questions and friction points this paper is trying to address.

Benchmarking GNN and LLM methods under compound graph deficiencies
Investigating overlooked vulnerabilities in LLM-enhanced graph learning
Developing iterative refinement framework for robust graph representation learning
Innovation

Methods, ideas, or system contributions that make the work stand out.

Iterative paradigm using Retrieval-Augmented Generation
Dynamic refinement via graph contrastive learning
Injecting class-consistent diverse augmentations
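The paper's implementation is not available here, so as a rough conceptual sketch only (not the authors' actual method), the iterative retrieve–generate–contrast idea described above might be organized as follows. The `retrieve` and `generate` callables are hypothetical stand-ins for the RAG retriever and LLM augmenter, and the loss is a standard InfoNCE-style contrastive objective between two views:

```python
import numpy as np

def info_nce_loss(z1, z2, temperature=0.5):
    """InfoNCE-style contrastive loss between two augmented views.

    z1, z2: (n, d) arrays of node embeddings; row i of each view forms
    a positive pair, and all other rows serve as negatives.
    """
    # L2-normalize so dot products become cosine similarities.
    z1 = z1 / np.linalg.norm(z1, axis=1, keepdims=True)
    z2 = z2 / np.linalg.norm(z2, axis=1, keepdims=True)
    sim = z1 @ z2.T / temperature               # (n, n) similarity matrix
    sim = sim - sim.max(axis=1, keepdims=True)  # numerical stability
    log_prob = sim - np.log(np.exp(sim).sum(axis=1, keepdims=True))
    # Positive pairs lie on the diagonal.
    return -np.mean(np.diag(log_prob))

def iterative_refinement(features, retrieve, generate, rounds=3):
    """Skeleton of a retrieve -> generate -> contrast loop.

    `retrieve` and `generate` are caller-supplied stand-ins for a RAG
    retriever and an LLM augmenter; each round produces a new augmented
    view, scores it contrastively, and folds it back into the features.
    """
    losses = []
    for _ in range(rounds):
        context = retrieve(features)              # retrieval-grounded evidence
        augmented = generate(features, context)   # augmented view of the graph
        losses.append(info_nce_loss(features, augmented))
        features = 0.5 * (features + augmented)   # dynamic refinement step
    return features, losses
```

In a real system the contrastive loss would drive gradient updates of a GNN encoder rather than a simple averaging step; the sketch only shows the control flow of iterating retrieval-grounded augmentation instead of injecting features once.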
Zhaoyan Wang
School of Computing, KAIST, Daejeon, Republic of Korea
Zheng Gao
School of Computer Science and Engineering, UNSW, Sydney, NSW, Australia
Arogya Kharel
School of Computing, KAIST, Daejeon, Republic of Korea
In-Young Ko
Korea Advanced Institute of Science and Technology
Software Engineering · Web Engineering · Services Computing