Leveraging Large Language Models for Building Interpretable Rule-Based Data-to-Text Systems

📅 2025-02-28
🏛️ International Conference on Natural Language Generation
📈 Citations: 0
Influential: 0
🤖 AI Summary
To address the high hallucination rates, poor interpretability, and substantial inference cost of end-to-end LLM-based data-to-text generation, this paper proposes a framework that uses a large language model (LLM) as an automatic generator of rule-based systems. Through prompt engineering, the LLM extracts generation rules from the WebNLG dataset and synthesizes them into executable, parameter-free Python code, yielding a fully interpretable, rule-driven generation system. The approach retains the transparency and computational efficiency of classical rule-based systems while outperforming direct LLM prompting on BLEU and BLEURT, producing fewer hallucinations than a BART model fine-tuned on the same data, and generating text in a fraction of the time required by neural approaches, running on a single CPU. The core contribution is a fully automated, LLM-driven pipeline for building lightweight, interpretable rule-based generators.

📝 Abstract
We introduce a simple approach that uses a large language model (LLM) to automatically implement a fully interpretable rule-based data-to-text system in pure Python. Experimental evaluation on the WebNLG dataset showed that such a constructed system produces text of better quality (according to the BLEU and BLEURT metrics) than the same LLM prompted to directly produce outputs, and produces fewer hallucinations than a BART language model fine-tuned on the same data. Furthermore, at runtime, the approach generates text in a fraction of the processing time required by neural approaches, using only a single CPU.
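To make the idea concrete, the following is a hypothetical sketch of the kind of parameter-free, rule-based Python code such an LLM might synthesize for WebNLG-style (subject, predicate, object) triples. The rule table, predicate names, and function names here are illustrative assumptions, not the paper's actual generated code.

```python
# Illustrative sketch: a fully interpretable, zero-parameter rule system
# that verbalizes (subject, predicate, object) triples via templates.
# All names and templates are hypothetical examples.

RULES = {
    "capital": "{s} has {o} as its capital.",
    "country": "{s} is located in {o}.",
    "leaderName": "{s} is led by {o}.",
}

def lexicalize(triple):
    """Turn a single triple into a sentence using a predicate-specific rule."""
    s, p, o = triple
    template = RULES.get(p, "{s} {p} {o}.")  # generic fallback rule
    return template.format(s=s.replace("_", " "), p=p, o=o.replace("_", " "))

def generate(triples):
    """Verbalize a set of triples by concatenating per-triple sentences."""
    return " ".join(lexicalize(t) for t in triples)

print(generate([("Amsterdam_Airport", "country", "Netherlands")]))
# → Amsterdam Airport is located in Netherlands.
```

Because the whole system is plain Python with no learned parameters, every output sentence can be traced back to an explicit rule, and inference is a handful of dictionary lookups and string formats, which is why a single CPU suffices at runtime.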
Problem

Research questions and friction points this paper is trying to address.

End-to-end neural data-to-text systems frequently hallucinate content not grounded in the input data.
Neural generators are black boxes, offering little insight into how a given output was produced.
Neural inference is computationally expensive, making real-time or low-resource deployment costly.
Innovation

Methods, ideas, or system contributions that make the work stand out.

Uses an LLM to automatically write a fully interpretable rule-based generator in pure Python
Outperforms direct LLM prompting on BLEU and BLEURT while hallucinating less than fine-tuned BART
Generates text in a fraction of neural inference time using only a single CPU
Jędrzej Warczyński
Poznan University of Technology, Faculty of Computing and Telecommunications, Poznan, Poland
Mateusz Lango
Charles University / Poznan University of Technology
natural language processing, machine learning, explainable AI
Ondrej Dusek
Charles University, Faculty of Mathematics and Physics, Prague, Czechia