Retrieval, Reasoning, Re-ranking: A Context-Enriched Framework for Knowledge Graph Completion

📅 2024-11-12
🏛️ arXiv.org
📈 Citations: 1
Influential: 0
🤖 AI Summary
Knowledge graph completion (KGC) faces dual challenges: embedding-based methods are sensitive to spurious relation patterns and long-tail entities, while text-based approaches suffer from the semantic gap between structured triples and natural language. Method: This paper proposes KGR3, a retrieval, reasoning, and re-ranking framework that jointly leverages structured triples and rich entity contexts (labels, descriptions, and aliases) in an end-to-end KGC pipeline. It introduces a dual-path candidate generation mechanism, pairing a base embedding model with LLM inference, and a context-aware re-ranking module built on a fine-tuned LLM, enabling synergistic structural and semantic reasoning. The framework integrates graph-based retrieval augmentation, LLM-based answer generation, and fine-tuned re-ranking, and is compatible with conventional embedding models (e.g., RotatE) for initial candidate generation. Contribution/Results: On FB15k237 and WN18RR, the best variant achieves absolute Hits@1 improvements of 12.3% and 5.6%, respectively, outperforming state-of-the-art approaches.

📝 Abstract
The Knowledge Graph Completion (KGC) task aims to infer the missing entity from an incomplete triple. Existing embedding-based methods rely solely on triples in the KG, making them vulnerable to specious relation patterns and long-tail entities. On the other hand, text-based methods struggle with the semantic gap between KG triples and natural language. Apart from triples, entity contexts (e.g., labels, descriptions, aliases) also play a significant role in augmenting KGs. To address these limitations, we propose KGR3, a context-enriched framework for KGC. KGR3 is composed of three modules. First, the Retrieval module gathers supporting triples from the KG, collects plausible candidate answers from a base embedding model, and retrieves context for each related entity. Then, the Reasoning module employs a large language model to generate potential answers for each query triple. Finally, the Re-ranking module combines candidate answers from the two modules mentioned above and fine-tunes an LLM to provide the best answer. Extensive experiments on widely used datasets demonstrate that KGR3 consistently improves various KGC methods. Specifically, the best variant of KGR3 achieves absolute Hits@1 improvements of 12.3% and 5.6% on the FB15k237 and WN18RR datasets.
Problem

Research questions and friction points this paper is trying to address.

Infer missing entities in incomplete knowledge graph triples
Address limitations of embedding-based and text-based KGC methods
Enhance KGC using entity contexts and multi-module reasoning
Innovation

Methods, ideas, or system contributions that make the work stand out.

Retrieval module gathers triples and entity contexts
Reasoning module uses LLM to generate potential answers
Re-ranking module combines and fine-tunes LLM outputs
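The three modules above can be sketched as a simple pipeline. This is an illustrative stand-in, not the paper's implementation: the KG structure, the function names, and the stub models (`base_model`, `llm`, `reranker`) are all hypothetical placeholders for the embedding model, the generative LLM, and the fine-tuned re-ranker that KGR3 actually uses.

```python
# Minimal sketch of a retrieval -> reasoning -> re-ranking pipeline
# in the spirit of KGR3. All components are toy stand-ins.

def retrieve(query, kg, base_model, k=5):
    """Retrieval: supporting triples, top-k candidates from a base
    embedding model (e.g., RotatE), and context for each candidate."""
    head = query[0]
    triples = [t for t in kg["triples"] if head in (t[0], t[2])]
    candidates = base_model(query)[:k]                 # hypothetical scorer
    contexts = {e: kg["contexts"].get(e, "") for e in candidates}
    return triples, candidates, contexts

def reason(query, triples, contexts, llm):
    """Reasoning: an LLM proposes answers from an enriched prompt."""
    prompt = f"Query: {query}\nSupport: {triples}\nContexts: {contexts}"
    return llm(prompt)                                 # hypothetical LLM call

def rerank(candidates, llm_answers, reranker):
    """Re-ranking: a fine-tuned LLM picks the best answer
    from the merged candidate pool (deduplicated, order kept)."""
    pool = list(dict.fromkeys(candidates + llm_answers))
    return reranker(pool)

# Toy data and stub models so the sketch runs end to end.
kg = {
    "triples": [("Paris", "capital_of", "France"),
                ("Lyon", "located_in", "France")],
    "contexts": {"France": "country in Europe",
                 "Italy": "country in Europe"},
}
base_model = lambda q: ["France", "Italy"]   # pretend embedding ranking
llm = lambda prompt: ["France"]              # pretend LLM generation
reranker = lambda pool: pool[0]              # pretend fine-tuned re-ranker

query = ("Paris", "capital_of", "?")
triples, cands, ctx = retrieve(query, kg, base_model)
answer = rerank(cands, reason(query, triples, ctx, llm), reranker)
```

The key design point the sketch preserves is the dual candidate path: the re-ranker sees both the embedding model's candidates and the LLM's generated answers, so either source can recover entities the other misses.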