π€ AI Summary
To address scalability limitations, poor linguistic quality, and weak factual consistency in automated question-answer (QA) pair generation from knowledge graphs (KGs), this paper proposes a deterministic KG-QA generation framework. Methodologically, it constructs a structured template library by integrating relation clustering with entity-type constraints, leverages large language models (LLMs) for semantic paraphrasing and factual verification, and introduces a KG-based distractor generation strategy to enhance QA robustness. The key contributions are: (i) the first integration of relation-semantic clustering with type-aware template generation, enabling high-fidelity, interpretable, and reproducible QA synthesis; and (ii) lightweight LLM integration to balance linguistic naturalness and factual accuracy. Experiments on multiple KG benchmarks demonstrate substantial improvements over state-of-the-art baselines across all three dimensions: scalability (generation throughput), linguistic quality (BLEU and LaMP scores), and factual accuracy (F1 > 92%), establishing new SOTA performance.
π Abstract
The generation of questions and answers (QA) from knowledge graphs (KG) plays a crucial role in the development and testing of educational platforms, dissemination tools, and large language models (LLM). However, existing approaches often struggle with scalability, linguistic quality, and factual consistency. This paper presents a scalable and deterministic pipeline for generating natural language QA from KGs, with an additional refinement step using LLMs to further enhance linguistic quality. The approach first clusters KG triplets based on their relations, creating reusable templates through natural language rules derived from the entity types of objects and relations. A module then leverages LLMs to refine these templates, improving clarity and coherence while preserving factual accuracy. Finally, the instantiation of answer options is achieved through a selection strategy that introduces distractors from the KG. Our experiments demonstrate that this hybrid approach efficiently generates high-quality QA pairs, combining scalability with fluency and linguistic precision.