🤖 AI Summary
This work addresses two limitations of existing approaches that integrate large language models (LLMs) with knowledge graphs (KGs): predefined operators that are not expressive enough to compose complex queries, and poor scalability when large volumes of facts are injected directly into prompts. The authors propose Code-on-Graph (CoG), a programmatic reasoning framework that represents KG schemas as Python classes. At each reasoning step, the LLM generates executable code against these classes, and the retrieved facts are instantiated as objects only when the code runs, so the facts themselves do not need to be placed in the prompt. The approach combines LLM generation, KG retrieval, and code-level reasoning in one iterative loop. On WebQSP, ComplexWebQuestions (CWQ), and GrailQA, CoG is reported to outperform prior state-of-the-art models by up to 10.5%.
📝 Abstract
Knowledge Graphs (KGs) are widely used to mitigate the limitations of Large Language Models (LLMs), such as outdated knowledge and hallucinations. Existing LLM-KG integration frameworks typically rely on predefined operators to retrieve factual knowledge from KGs and inject it into prompts for answer generation. This paradigm faces two critical bottlenecks: 1) Inflexibility: the predefined operators are limited in scope and lack the compositional expressiveness needed to capture the complex semantics of KG questions. 2) Unscalability: injecting factual knowledge directly into prompts does not scale to large volumes of facts. To address these two bottlenecks, we propose Code-on-Graph (CoG), a programmatic reasoning framework for LLM-KG integration. Specifically, given the factual knowledge retrieved at each reasoning step, CoG first identifies the corresponding KG schemas and represents them as Python classes, which serve as abstract interfaces to the retrieved facts. It then generates executable code grounded in these classes, with the retrieved facts instantiated as objects of the corresponding classes during execution. This design enables flexible code-based reasoning while avoiding the direct injection of large-scale factual knowledge into prompts. Experiments on WebQSP, CWQ, and GrailQA demonstrate that CoG outperforms prior state-of-the-art models by up to 10.5%.
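To make the schema-as-class idea concrete, the following is a minimal sketch of a single reasoning step, not the authors' implementation. The class names (`Film`, `Director`), the triple format, the toy facts, and the helpers `instantiate` and `run` are all hypothetical, and the string `GENERATED_CODE` stands in for code an LLM might produce after seeing only the class definitions.

```python
from dataclasses import dataclass, field

# Hypothetical schema classes. In a CoG-style system these would be derived
# from the KG schemas of the facts retrieved at the current reasoning step.
@dataclass
class Film:
    name: str
    release_year: int

@dataclass
class Director:
    name: str
    films: list = field(default_factory=list)

# Toy "retrieved facts" as (subject, relation, object) triples (made-up data).
TRIPLES = [
    ("Director A", "directed", "Film X"),
    ("Director A", "directed", "Film Y"),
    ("Film X", "release_year", 1999),
    ("Film Y", "release_year", 2005),
]

def instantiate(triples):
    """Turn the retrieved triples into objects of the schema classes."""
    years = {s: o for s, r, o in triples if r == "release_year"}
    directors = {}
    for s, r, o in triples:
        if r == "directed":
            director = directors.setdefault(s, Director(name=s))
            director.films.append(Film(name=o, release_year=years[o]))
    return directors

# Stand-in for LLM-generated code. It relies only on the class interface
# (Director.films, Film.release_year), never on the facts themselves.
GENERATED_CODE = """
def answer(director):
    return min(director.films, key=lambda f: f.release_year).name
"""

def run(generated_code, director):
    """Execute the generated code and apply it to the instantiated objects."""
    namespace = {}
    exec(generated_code, namespace)
    return namespace["answer"](director)

directors = instantiate(TRIPLES)
result = run(GENERATED_CODE, directors["Director A"])
print(result)
```

In this sketch only the class definitions would need to appear in the prompt. The facts enter at execution time as objects, so the prompt size does not grow with the number of retrieved triples. A real system would also need to sandbox the execution of generated code, which this example does not do.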