KGQuest: Template-Driven QA Generation from Knowledge Graphs with LLM-Based Refinement

📅 2025-11-14

📈 Citations: 0

✨ Influential: 0

career value

141K/year

🤖 AI Summary

To address scalability limitations, poor linguistic quality, and weak factual consistency in automated question-answer (QA) pair generation from knowledge graphs (KGs), this paper proposes a deterministic KG-QA generation framework. Methodologically, it constructs a structured template library by integrating relation clustering with entity-type constraints, leverages large language models (LLMs) for semantic paraphrasing and factual verification, and introduces a KG-based distractor generation strategy to enhance QA robustness. The key contributions are: (i) the first integration of relation-semantic clustering with type-aware template generation, enabling high-fidelity, interpretable, and reproducible QA synthesis; and (ii) lightweight LLM integration to balance linguistic naturalness and factual accuracy. Experiments on multiple KG benchmarks demonstrate substantial improvements over state-of-the-art baselines across all three dimensions: scalability (generation throughput), linguistic quality (BLEU and LaMP scores), and factual accuracy (F1 > 92%), establishing new SOTA performance.

Technology Category

Application Category

📝 Abstract

The generation of questions and answers (QA) from knowledge graphs (KG) plays a crucial role in the development and testing of educational platforms, dissemination tools, and large language models (LLM). However, existing approaches often struggle with scalability, linguistic quality, and factual consistency. This paper presents a scalable and deterministic pipeline for generating natural language QA from KGs, with an additional refinement step using LLMs to further enhance linguistic quality. The approach first clusters KG triplets based on their relations, creating reusable templates through natural language rules derived from the entity types of objects and relations. A module then leverages LLMs to refine these templates, improving clarity and coherence while preserving factual accuracy. Finally, the instantiation of answer options is achieved through a selection strategy that introduces distractors from the KG. Our experiments demonstrate that this hybrid approach efficiently generates high-quality QA pairs, combining scalability with fluency and linguistic precision.

Problem

Research questions and friction points this paper is trying to address.

Generating scalable QA pairs from knowledge graphs with linguistic quality

Ensuring factual consistency while improving template clarity through LLMs

Creating effective distractors for QA instantiation from knowledge graphs

Innovation

Methods, ideas, or system contributions that make the work stand out.

Template-driven QA generation from knowledge graphs

LLM-based refinement for linguistic quality enhancement

Distractor selection strategy from knowledge graph entities

🔎 Similar Papers

CuriousLLM: Elevating Multi-Document Question Answering with LLM-Enhanced Knowledge Graph Reasoning