Exploring the Potential of Large Language Models for Heterophilic Graphs

📅 2024-08-26

🏛️ arXiv.org

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Node classification is difficult on heterophilic graphs, where neighboring nodes often carry different labels, and existing methods for such graphs overlook the textual data attached to nodes. This paper proposes a two-stage framework that integrates large language models (LLMs) into heterophilic graph modeling. In the first stage, an LLM is fine-tuned to classify each edge as homophilic or heterophilic from the text of its two endpoint nodes. In the second stage, message propagation in the graph neural network (GNN) is adaptively reweighted per edge type, using node features, graph structure, and the predicted edge type. To reduce the cost of deploying LLMs, the authors also distill the fine-tuned model into smaller, more efficient models that maintain competitive performance. Experiments support the framework's effectiveness for node classification on heterophilic graphs.

📝 Abstract

Large language models (LLMs) have presented significant opportunities to enhance various machine learning applications, including graph neural networks (GNNs). By leveraging the vast open-world knowledge within LLMs, we can more effectively interpret and utilize textual data to better characterize heterophilic graphs, where neighboring nodes often have different labels. However, existing approaches for heterophilic graphs overlook the rich textual data associated with nodes, which could unlock deeper insights into their heterophilic contexts. In this work, we explore the potential of LLMs for modeling heterophilic graphs and propose a novel two-stage framework: LLM-enhanced edge discriminator and LLM-guided edge reweighting. In the first stage, we fine-tune the LLM to better identify homophilic and heterophilic edges based on the textual content of their nodes. In the second stage, we adaptively manage message propagation in GNNs for different edge types based on node features, structures, and heterophilic or homophilic characteristics. To cope with the computational demands when deploying LLMs in practical scenarios, we further explore model distillation techniques to fine-tune smaller, more efficient models that maintain competitive performance. Extensive experiments validate the effectiveness of our framework, demonstrating the feasibility of using LLMs to enhance node classification on heterophilic graphs.
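The abstract describes heterophilic graphs as those where neighboring nodes often have different labels. One common way to quantify this is the edge homophily ratio: the fraction of edges whose endpoints share a label. The sketch below is illustrative only and is not taken from the paper; the function name `edge_homophily` and the toy graph are assumptions made for this example.

```python
def edge_homophily(edges, labels):
    """Return the fraction of edges whose two endpoints share a label.

    Values near 1 indicate a homophilic graph; values near 0 indicate
    a heterophilic one. Hypothetical helper, not the paper's code.
    """
    if not edges:
        raise ValueError("edge list is empty")
    same = sum(1 for u, v in edges if labels[u] == labels[v])
    return same / len(edges)


# Toy graph (assumed data): a 4-cycle where node 1 has a different label.
edges = [(0, 1), (1, 2), (2, 3), (3, 0)]
labels = {0: "A", 1: "B", 2: "A", 3: "A"}
ratio = edge_homophily(edges, labels)
print(ratio)
```

Here two of the four edges connect nodes with the same label, so the ratio is 0.5. An edge discriminator of the kind the abstract describes would, in effect, try to predict the per-edge term `labels[u] == labels[v]` from node text when labels are not available.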

Problem

Research questions and friction points this paper is trying to address.

Enhancing heterophilic graph interpretation with LLMs

Leveraging textual data for node classification

Optimizing GNNs through LLM-guided edge reweighting

Innovation

Methods, ideas, or system contributions that make the work stand out.

LLM-enhanced edge discriminator

LLM-guided edge reweighting

Model distillation for efficiency
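To make the interplay between the edge discriminator and edge reweighting concrete, here is a minimal sketch of one propagation step with per-edge weights. It assumes the discriminator outputs a probability `p` that an edge is homophilic and maps it to a signed weight `w = 2p - 1`, so neighbors across likely heterophilic edges are subtracted rather than added. This mapping is a generic device chosen for illustration, not the paper's reweighting scheme, and all names (`reweighted_aggregate`, `p_homophilic`) are hypothetical.

```python
def reweighted_aggregate(features, edges, p_homophilic):
    """One message-passing step with signed edge weights w = 2p - 1.

    features: dict mapping node id -> list of floats
    edges: list of undirected (u, v) pairs
    p_homophilic: assumed discriminator output, one probability per edge
    """
    dim = len(next(iter(features.values())))
    # Start from each node's own features (self-loop with weight 1).
    out = {node: list(vec) for node, vec in features.items()}
    norm = {node: 1.0 for node in features}
    for (u, v), p in zip(edges, p_homophilic):
        w = 2.0 * p - 1.0  # +1 for homophilic, -1 for heterophilic
        for src, dst in ((u, v), (v, u)):  # propagate in both directions
            for k in range(dim):
                out[dst][k] += w * features[src][k]
            norm[dst] += abs(w)
    # Normalize by the total absolute weight received by each node.
    return {n: [x / norm[n] for x in vec] for n, vec in out.items()}


# Toy input (assumed): edge (0, 1) judged heterophilic, edge (0, 2) homophilic.
features = {0: [1.0, 0.0], 1: [0.0, 1.0], 2: [1.0, 0.0]}
edges = [(0, 1), (0, 2)]
h = reweighted_aggregate(features, edges, p_homophilic=[0.0, 1.0])
print(h)
```

In this toy case node 2 keeps its representation `[1.0, 0.0]` because its only neighbor is similar, while node 1 is pushed away from node 0 to `[-0.5, 0.5]`. A learned reweighting, as described in the abstract, would also take node features and structure into account rather than relying on a fixed mapping.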
