Filter-then-Generate: Large Language Models with Structure-Text Adapter for Knowledge Graph Completion

📅 2024-12-12
📈 Citations: 0
Influential: 0
🤖 AI Summary
To address the hallucination, weak structural modeling, and vast entity-candidate space that hinder large language models (LLMs) in knowledge graph completion (KGC), this paper proposes a "filter-then-generate" two-stage paradigm. First, the set of candidate entities is filtered down to a small shortlist; second, LLMs (LLaMA/Qwen) are prompted in a multiple-choice question-answering format to generate the final prediction. A flexible ego-graph serialization prompt and a lightweight structure-text adapter are further introduced to align graph-structural and textual representations in a contextualized manner. This work is the first to explicitly formulate KGC as a structure-guided multiple-choice reasoning task. Evaluated on standard benchmarks including FB15k-237 and WN18RR, the method achieves an average MRR improvement of over 8% relative to state-of-the-art approaches. The code and instruction-tuning dataset are publicly released.

📝 Abstract
Large Language Models (LLMs) present massive inherent knowledge and superior semantic comprehension capability, which have revolutionized various tasks in natural language processing. Despite their success, a critical gap remains in enabling LLMs to perform knowledge graph completion (KGC). Empirical evidence suggests that LLMs consistently perform worse than conventional KGC approaches, even through sophisticated prompt design or tailored instruction-tuning. Fundamentally, applying LLMs to KGC introduces several critical challenges, including a vast set of entity candidates, the hallucination issue of LLMs, and under-exploitation of the graph structure. To address these challenges, we propose a novel instruction-tuning-based method, namely FtG. Specifically, we present a *filter-then-generate* paradigm and formulate the KGC task into a multiple-choice question format. In this way, we can harness the capability of LLMs while mitigating the issues caused by hallucination. Moreover, we devise a flexible ego-graph serialization prompt and employ a structure-text adapter to couple structure and text information in a contextualized manner. Experimental results demonstrate that FtG achieves substantial performance gains compared to existing state-of-the-art methods. The instruction dataset and code are available at https://github.com/LB0828/FtG.
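As a rough illustration of the *filter-then-generate* idea described in the abstract (not the authors' implementation), a conventional KGC scorer can first narrow the vast entity-candidate set, and the query can then be posed to an LLM as a multiple-choice question. All function and variable names below are hypothetical, and the candidate scores are toy values standing in for the output of a structure-aware KGC model.

```python
# Hypothetical sketch of a filter-then-generate pipeline for KGC.
# Stage 1 ("filter"): a conventional scorer trims the entity set.
# Stage 2 ("generate"): the LLM picks among survivors via a
# multiple-choice prompt (LLM call itself is omitted here).

def filter_candidates(scores, entities, k=4):
    """Keep the k highest-scoring candidate entities (the filter stage)."""
    ranked = sorted(zip(entities, scores), key=lambda p: p[1], reverse=True)
    return [ent for ent, _ in ranked[:k]]

def build_mcq_prompt(head, relation, candidates):
    """Format the KGC query as a multiple-choice question (the generate stage)."""
    letters = "ABCDEFGH"
    options = "\n".join(f"{letters[i]}. {c}" for i, c in enumerate(candidates))
    return (
        f"Given the incomplete triple ({head}, {relation}, ?), "
        f"choose the most plausible tail entity.\n{options}\nAnswer:"
    )

# Toy example: scores would come from a structure-aware KGC model.
entities = ["Paris", "London", "Berlin", "Lyon", "Tokyo"]
scores = [0.91, 0.12, 0.08, 0.35, 0.02]
shortlist = filter_candidates(scores, entities, k=3)
prompt = build_mcq_prompt("France", "capital", shortlist)
print(prompt)
```

Constraining the answer space to a lettered shortlist is what mitigates hallucination: the LLM can only emit one of the filtered options rather than free-generating an entity name.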
Problem

Research questions and friction points this paper is trying to address.

Large Language Models
Knowledge Graph Completion
Structural Information Utilization
Innovation

Methods, ideas, or system contributions that make the work stand out.

FtG
Knowledge Graph Completion
Prompt Engineering
Ben Liu
School of Computer Science, Wuhan University, China
Jihai Zhang
DAMO Academy, Alibaba Group, Hangzhou, 310023, China
Fangquan Lin
Senior Expert & Staff Engineer, Alibaba Group
Large Language Model · Deep Learning · Information Retrieval · Recommender System
Cheng Yang
DAMO Academy, Alibaba Group, Hangzhou, 310023, China
Min Peng
School of Computer Science, Wuhan University, China