Code-on-Graph: Iterative Programmatic Reasoning via Large Language Models on Knowledge Graphs

📅 2026-06-02

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses two limitations of existing approaches that integrate large language models (LLMs) with knowledge graphs (KGs): predefined retrieval operators with limited compositional expressiveness, and poor scalability when large volumes of retrieved facts are injected directly into prompts. The authors propose Code-on-Graph (CoG), a programmatic reasoning framework that represents the KG schemas of the retrieved facts as Python classes and generates executable code against those classes, with the facts instantiated as objects at execution time. Because the prompt carries the schema rather than the facts themselves, the iterative reasoning avoids injecting large numbers of facts. On WebQSP, ComplexWebQuestions (CWQ), and GrailQA, CoG outperforms prior state-of-the-art models by up to 10.5%.

📝 Abstract

Knowledge Graphs (KGs) are widely used to mitigate the limitations of Large Language Models (LLMs), such as outdated knowledge and hallucinations. Existing LLM-KG integration frameworks typically rely on predefined operators to retrieve factual knowledge from KGs and inject it into prompts for answer generation. This paradigm faces two critical bottlenecks: 1) Inflexibility: The predefined operators are limited in scope and thus lack sufficient compositional expressiveness to fully capture the complex semantics required by KG questions. 2) Unscalability: Direct injection of factual knowledge into prompts limits scalability in handling large-scale factual knowledge. To address these two bottlenecks, we propose Code-on-Graph (CoG), a programmatic reasoning framework for LLM-KG integration. Specifically, given the factual knowledge retrieved at each reasoning step, CoG first identifies the corresponding KG schemas and represents these schemas as Python classes, which serve as abstract interfaces to the retrieved facts. It then generates executable code grounded in these classes, with the retrieved facts instantiated as objects of the corresponding classes during execution. This design enables flexible code-based reasoning while avoiding the direct injection of large-scale factual knowledge into prompts. Experiments on WebQSP, CWQ, and GrailQA demonstrate that CoG outperforms prior state-of-the-art models by up to 10.5%.
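The schema-as-class idea described in the abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the `Film` class, the toy facts, the helper names `schema_prompt` and `run_generated`, and the hard-coded `GENERATED_CODE` string that stands in for LLM output. The point it tries to show is that the prompt contains only the class interface, while the retrieved facts become objects that the generated code operates on.

```python
from dataclasses import dataclass, fields


@dataclass
class Film:
    """Hypothetical schema class for one KG entity type (not from the paper)."""
    title: str
    release_year: int
    director: str


# Toy stand-in for facts retrieved from a KG at one reasoning step.
RETRIEVED_FACTS = [
    {"title": "Film A", "release_year": 1999, "director": "Director X"},
    {"title": "Film B", "release_year": 2005, "director": "Director X"},
    {"title": "Film C", "release_year": 2010, "director": "Director Y"},
]


def schema_prompt(question, schema_classes):
    """Build a prompt from class interfaces only; no facts are included."""
    lines = []
    for cls in schema_classes:
        lines.append(f"class {cls.__name__}:")
        for f in fields(cls):
            type_name = getattr(f.type, "__name__", f.type)
            lines.append(f"    {f.name}: {type_name}")
    lines.append(f"# Question: {question}")
    lines.append("# Write Python over the list `films` and assign `answer`.")
    return "\n".join(lines)


# Hard-coded stand-in for code an LLM might generate from the prompt above.
GENERATED_CODE = """
by_x = [f for f in films if f.director == "Director X"]
answer = max(by_x, key=lambda f: f.release_year).title
"""


def run_generated(code, facts):
    """Instantiate facts as objects, then execute the generated code on them."""
    films = [Film(**row) for row in facts]
    namespace = {"films": films}
    # exec on untrusted model output is unsafe; a real system would sandbox it.
    exec(code, namespace)
    return namespace["answer"]


if __name__ == "__main__":
    question = "Which film by Director X was released most recently?"
    print(schema_prompt(question, [Film]))
    print(run_generated(GENERATED_CODE, RETRIEVED_FACTS))
```

In this sketch the prompt length depends on the number of schema fields rather than the number of retrieved facts, which is one way to read the scalability argument. The filter-then-`max` composition is ordinary Python rather than a predefined operator, which is one way to read the flexibility argument.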

Problem

Research questions and friction points this paper is trying to address.

Knowledge Graphs

Large Language Models

programmatic reasoning

compositional expressiveness

scalability

Innovation

Methods, ideas, or system contributions that make the work stand out.

programmatic reasoning

knowledge graphs

large language models