🤖 AI Summary
Large language models (LLMs) suffer significant performance degradation on knowledge graph question answering (KGQA) involving unseen entities. The root cause is knowledge graph incompleteness: entity linking fails, and retrieval returns only partially relevant evidence, such as explicit facts, implicit cues, or weakly related triples.
Method: We propose a novel “Partially Relevant Knowledge Activation” paradigm within a retrieval-augmented generation (RAG) framework. It constructs context from KG triple variants to encode partial relevance and employs prompt learning to activate LLMs’ latent reasoning capabilities, supported by theoretical analysis and empirical validation.
Contribution/Results: We formally define the "unseen-entity KGQA" task for the first time, relaxing RAG's traditional reliance on complete, accurate knowledge. Evaluated on two KGQA benchmarks, our method effectively suppresses noise interference and achieves substantial accuracy gains over baselines that rely on embedding similarity.
📝 Abstract
Retrieval-Augmented Generation (RAG) shows impressive performance by supplementing and substituting parametric knowledge in Large Language Models (LLMs). Retrieved knowledge can be divided into three types: explicit answer evidence, implicit answer clues, and insufficient answer context, the last of which can be further categorized into totally irrelevant and partially relevant information. Effectively utilizing partially relevant knowledge remains a key challenge for RAG systems, especially when retrieving from incomplete knowledge bases. Contrary to the conventional view, we propose a new perspective: partially relevant knowledge can awaken reasoning capabilities already embedded in LLMs. To investigate this phenomenon comprehensively, we construct partially relevant knowledge from the triples located on the gold reasoning path and their variants, removing the portion of the path that contains the answer. We provide a theoretical analysis of the awakening effect in LLMs and support our hypothesis with experiments on two Knowledge Graph (KG) Question Answering (QA) datasets. Furthermore, we present a new task, Unseen Entity KGQA, simulating real-world challenges where entity linking fails due to KG incompleteness. Our awakening-based approach demonstrates greater efficacy in practical applications, outperforming traditional methods that rely on embedding-based similarity, which are prone to returning noisy information.
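The construction described above (keeping gold-path triples while removing the part of the path that contains the answer) can be sketched as follows. This is a minimal illustration, not the paper's actual code: the function names, the triple format, and the example KG path are all assumptions made for clarity.

```python
# Hypothetical sketch of building "partially relevant knowledge":
# given a gold reasoning path (a chain of KG triples) and the answer
# entity, drop every triple that mentions the answer, so the remaining
# context is relevant to the question but never states the answer.

def build_partial_context(gold_path, answer):
    """Keep only path triples that do NOT contain the answer entity."""
    return [(h, r, t) for (h, r, t) in gold_path if answer not in (h, t)]

def linearize(triples):
    """Render triples as a textual context for an LLM prompt."""
    return "\n".join(f"({h}, {r}, {t})" for h, r, t in triples)

# Illustrative two-hop path; the second triple holds the answer "Chicago".
gold_path = [
    ("Barack Obama", "spouse", "Michelle Obama"),
    ("Michelle Obama", "place_of_birth", "Chicago"),
]
partial = build_partial_context(gold_path, answer="Chicago")
print(linearize(partial))  # only the first, answer-free triple survives
```

Variants of the surviving triples (e.g., paraphrased relations or swapped surface forms) could then be generated the same way to probe how much partial relevance suffices to activate the model's latent knowledge.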