Model Generalization on Text Attribute Graphs: Principles with Large Language Models

📅 2025-02-17
📈 Citations: 0
Influential: 0
🤖 AI Summary
To address weak zero-shot generalization on text-attributed graphs (TAGs) caused by label scarcity, oversized node neighborhoods, and misalignment between node embeddings and the LLM token space, this paper proposes the framework LLM-BP, built on two core innovations: (1) task-adaptive unified textual attribute embeddings, which align node semantics with the LLM's input token space via task-aware prompting; and (2) a transferable graph information aggregation mechanism based on belief propagation with LLM-estimated, graph-adaptive parameters. Evaluated on 11 real-world TAG benchmarks, LLM-BP significantly outperforms state-of-the-art methods: task-conditional embeddings yield an 8.10% improvement, and adaptive aggregation contributes an additional 1.71% gain. The method alleviates context-length constraints and strengthens cross-task generalization.
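The task-adaptive embedding idea can be sketched in a few lines: prepend a task-aware prompt to each node's text before encoding, so the resulting vectors are conditioned on the downstream task. The paper uses LLM-based encoders; the `toy_encoder` below is a deterministic hashing stand-in so the sketch is self-contained, and the prompt text is an invented example, not taken from the paper.

```python
import hashlib
import numpy as np

def toy_encoder(text: str, dim: int = 8) -> np.ndarray:
    """Deterministic stand-in for an LLM-based text encoder.

    Hashes the text to seed a random unit vector; a real system would
    call a sentence/LLM encoder here instead.
    """
    seed = int.from_bytes(hashlib.sha256(text.encode()).digest()[:4], "big")
    rng = np.random.default_rng(seed)
    v = rng.standard_normal(dim)
    return v / np.linalg.norm(v)

def task_conditional_embed(node_text: str, task_prompt: str) -> np.ndarray:
    # Prepend a task-aware prompt so the embedding reflects the task,
    # not just the raw node attribute text.
    return toy_encoder(f"{task_prompt} {node_text}")

# Hypothetical usage: same node text, different tasks -> different embeddings.
emb = task_conditional_embed(
    "Graph neural networks for citation analysis.",
    "Classify the paper's research area:",
)
```

The key design point is that the conditioning lives entirely in the prompt, so the same frozen encoder transfers across tasks and graphs without retraining.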

📝 Abstract
Large language models (LLMs) have recently been introduced to graph learning, aiming to extend their zero-shot generalization success to tasks where labeled graph data is scarce. Among these applications, inference over text-attributed graphs (TAGs) presents unique challenges: existing methods struggle with LLMs' limited context length for processing large node neighborhoods and the misalignment between node embeddings and the LLM token space. To address these issues, we establish two key principles for ensuring generalization and derive the framework LLM-BP accordingly: (1) Unifying the attribute space with task-adaptive embeddings, where we leverage LLM-based encoders and task-aware prompting to enhance generalization of the text attribute embeddings; (2) Developing a generalizable graph information aggregation mechanism, for which we adopt belief propagation with LLM-estimated parameters that adapt across graphs. Evaluations on 11 real-world TAG benchmarks demonstrate that LLM-BP significantly outperforms existing approaches, achieving 8.10% improvement with task-conditional embeddings and an additional 1.71% gain from adaptive aggregation.
Problem

Research questions and friction points this paper is trying to address.

Extend LLM zero-shot generalization to graph learning
Address LLMs' context length limitation in TAGs
Align node embeddings with LLM token space
Innovation

Methods, ideas, or system contributions that make the work stand out.

LLM-based encoders for task-adaptive embeddings
Belief propagation with LLM-estimated parameters
Unified attribute space for enhanced generalization