🤖 AI Summary
This paper addresses severe hallucination and out-of-distribution errors in LLM-generated SPARQL queries caused by incorrect URI generation. To this end, we propose PGMR, a framework built around a post-generation, non-parametric memory retrieval mechanism. PGMR decouples language modeling from the generation of structured knowledge graph (KG) elements in three steps: an LLM first produces a query draft; an external KG index then retrieves the matching URIs by similarity; finally, a lightweight re-ranking strategy refines the candidates. Because URIs are retrieved from the KG index rather than generated from the model's parametric knowledge, the design keeps them factually consistent with the KG. Extensive evaluation across multiple KGQA benchmarks and mainstream LLMs shows that PGMR significantly suppresses URI hallucination, reducing it to near zero in several scenarios, and substantially improves SPARQL execution accuracy. PGMR is presented as a new paradigm for trustworthy SPARQL query generation in knowledge graph question answering.
📝 Abstract
The ability to generate SPARQL queries from natural language questions is crucial for ensuring efficient and accurate retrieval of structured data from knowledge graphs (KGs). While large language models (LLMs) have been widely adopted for SPARQL query generation, they are often susceptible to hallucinations and out-of-distribution errors when producing KG elements such as Uniform Resource Identifiers (URIs) from their internal parametric knowledge. The resulting content often appears plausible but is factually incorrect, which poses significant challenges for the use of LLMs in real-world information retrieval (IR) applications and has led to increased research aimed at detecting and mitigating such errors. In this paper, we introduce PGMR (Post-Generation Memory Retrieval), a modular framework that incorporates a non-parametric memory module to retrieve KG elements and enhance LLM-based SPARQL query generation. Our experimental results indicate that PGMR consistently delivers strong performance across diverse datasets, data distributions, and LLMs. Notably, PGMR significantly mitigates URI hallucinations, nearly eliminating the problem in several scenarios.
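To make the idea of post-generation retrieval concrete, the following is a minimal sketch and not the authors' implementation. It assumes that the LLM draft marks each KG element with a natural-language label in a hypothetical `[[...]]` placeholder, that the memory is a small in-process dictionary of made-up `ex:` URIs, and that string similarity from Python's `difflib` stands in for whatever retriever PGMR actually uses. Re-ranking is omitted.

```python
import re
from difflib import SequenceMatcher

# Toy non-parametric memory mapping labels to URIs.
# All labels and `ex:` URIs are made up for illustration.
MEMORY = {
    "Barack Obama": "ex:E1",
    "Michelle Obama": "ex:E2",
    "spouse": "ex:P1",
    "place of birth": "ex:P2",
}

# Hypothetical placeholder syntax: the draft wraps each KG element label in [[...]].
PLACEHOLDER = re.compile(r"\[\[(.+?)\]\]")


def similarity(a: str, b: str) -> float:
    """Case-insensitive string similarity in [0, 1]."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def retrieve_uri(label: str, memory: dict[str, str]) -> str:
    """Return the URI whose stored label is most similar to the drafted label."""
    best_label = max(memory, key=lambda known: similarity(label, known))
    return memory[best_label]


def ground_draft(draft: str, memory: dict[str, str]) -> str:
    """Replace every placeholder in the draft with a URI retrieved from memory."""
    return PLACEHOLDER.sub(lambda m: retrieve_uri(m.group(1), memory), draft)


# A draft as an LLM might produce it, including a misspelled entity label.
draft = "SELECT ?x WHERE { [[Barak Obama]] [[spouse]] ?x . }"
grounded = ground_draft(draft, MEMORY)
print(grounded)
```

In this sketch every URI in the final query is copied from the memory, so the model cannot emit a URI that does not exist in the index; it can still select the wrong one, which is where a re-ranking step would help.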