🤖 AI Summary
This work addresses the limitations of existing knowledge base question answering approaches: formal query generation is brittle and hard to interpret, while end-to-end answer retrieval is computationally expensive and prone to hallucination. To overcome these issues, the authors propose DeSQ, a three-stage, KB-agnostic framework that turns complex questions into interpretable structured queries and can be evaluated without a live knowledge base endpoint. The framework first decomposes the question into atomic constraints, then maps each constraint to a SPARQL fragment with placeholder variables and URIs, accompanied by a grounding block that describes each placeholder, and finally assembles the fragments into a complete SPARQL query. By combining the strengths of the query generation and answer retrieval paradigms, DeSQ surpasses state-of-the-art approaches on four out of five major benchmarks and is more robust to lexical variation.
📝 Abstract
Dominant approaches to Knowledge Base Question Answering (KBQA) fall into two categories. The first generates a formal query, which is brittle and offers limited explainability; the second retrieves answers directly through KB exploration, which is computationally costly and prone to hallucination. To combine the strengths of both paradigms while mitigating their respective weaknesses, we introduce DeSQ (Decomposition-based SPARQL Query Generation), a KB-agnostic framework that operates in three stages. First, it decomposes complex questions into Atomic Constraints (ACs) that mirror the relational structure of the underlying KB. Second, it generates a two-part structured output: (a) a mapping of each AC to its corresponding SPARQL fragment, using standardized variable and URI placeholders, and (b) a URI grounding block describing each placeholder. Third, it assembles these fragments into a complete SPARQL query. DeSQ surpasses state-of-the-art approaches on four out of five major benchmarks and demonstrates superior robustness to lexical variation. Beyond performance gains, our framework greatly simplifies evaluation by eliminating the need for a live KB endpoint, and its structured output enables fine-grained error analysis, allowing more targeted interventions for improvement.
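The three stages can be illustrated with a small sketch. This is not the authors' implementation: the example question, the placeholder syntax (`:E1` for entities, `:R1` for relations), the `AtomicConstraint` class, and both helper functions are assumptions made here for illustration only. The sketch shows how a list of atomic constraints, each paired with a SPARQL fragment, might be assembled into one query, and how a grounding block of placeholder descriptions could be checked for completeness without contacting a KB endpoint.

```python
import re
from dataclasses import dataclass

# Hypothetical placeholder syntax (an assumption, not from the paper):
# ":E<n>" stands for an entity URI, ":R<n>" for a relation URI.
PLACEHOLDER = re.compile(r":[ER]\d+")


@dataclass(frozen=True)
class AtomicConstraint:
    text: str      # natural-language constraint from the decomposition stage
    fragment: str  # SPARQL fragment that uses variables and placeholders


def assemble_query(constraints, target_var="?x"):
    """Join the fragments into a single SELECT query (assembly stage)."""
    body = "\n".join(f"  {ac.fragment}" for ac in constraints)
    return f"SELECT DISTINCT {target_var} WHERE {{\n{body}\n}}"


def missing_groundings(constraints, grounding):
    """Return placeholders used in fragments but absent from the grounding block."""
    used = set()
    for ac in constraints:
        used.update(PLACEHOLDER.findall(ac.fragment))
    return used - set(grounding)


# Illustrative question: "Which films directed by Christopher Nolan
# were released after 2010?"
acs = [
    AtomicConstraint("the film is directed by Christopher Nolan", "?x :R1 :E1 ."),
    AtomicConstraint("the film has a release date", "?x :R2 ?d ."),
    AtomicConstraint("the release date is after 2010", "FILTER(YEAR(?d) > 2010)"),
]

# Grounding block: a textual description of each placeholder, no KB lookup.
grounding = {
    ":E1": "Christopher Nolan (person)",
    ":R1": "directed by",
    ":R2": "release date",
}

query = assemble_query(acs)
print(query)
print("Ungrounded placeholders:", missing_groundings(acs, grounding))
```

Because each fragment stays tied to the constraint it came from, an error can be attributed to a specific step (a wrong decomposition, a wrong fragment, or a wrong grounding description), which is the kind of fine-grained analysis the abstract refers to.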