AI Summary
This work addresses the high maintenance cost of traditional logic-based visual question answering systems, which require extensive manual rule adjustments when tasks change. The authors propose a novel approach that leverages large language models (LLMs) to distill interpretable Answer Set Programming (ASP) rules from only a few examples, guided by natural language prompts. By integrating feedback from an ASP solver, the method automatically corrects erroneous rules and dynamically extends its reasoning theory. This is the first study to employ LLMs for the automatic induction of neuro-symbolic ASP rules. Evaluated on multiple visual question answering benchmarks, the approach achieves high rule accuracy with minimal training samples, significantly outperforming conventional rule-learning methods and substantially reducing the overhead of rule maintenance.
Abstract
Visual Question Answering (VQA) is the task of answering questions about images, requiring the integration of multimodal input and reasoning. Modular approaches that incorporate logic-based representations into the reasoning component offer clear advantages over end-to-end trained systems, particularly in terms of interpretability. However, adapting or extending these representations when task requirements change can place a significant burden on developers. To address this challenge, we present an approach for distilling rules from Large Language Models (LLMs). Our method prompts an LLM to extend an initial VQA reasoning theory, expressed as an answer-set program, to meet new requirements of the task. Examples from VQA datasets guide the LLM, validate the results, and help correct erroneous rules by leveraging feedback from the ASP solver. We demonstrate that our approach is effective across diverse VQA datasets. Notably, only a few examples are needed to elicit correct rules from LLMs. Our experiments suggest that rule distillation from LLMs is a promising alternative to traditional data-driven rule learning approaches. Under consideration in Theory and Practice of Logic Programming (TPLP).
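The prompt, validate, and correct cycle described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the names `Example`, `distill_rules`, `propose`, and `solve` are hypothetical, the prompt wording is invented, and the LLM and ASP solver are passed in as plain callables so that the sketch stays independent of any particular model or solver API.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass
class Example:
    facts: str     # scene and question encoded as ASP facts (assumed encoding)
    expected: str  # ground-truth answer taken from the VQA dataset


def distill_rules(
    theory: str,
    requirement: str,
    examples: List[Example],
    propose: Callable[[str], str],
    solve: Callable[[str], Tuple[Optional[str], Optional[str]]],
    max_rounds: int = 3,
) -> Optional[str]:
    """Ask an LLM for rules extending `theory`, retrying with solver feedback.

    `propose` (hypothetical) maps a prompt to candidate ASP rules.
    `solve` (hypothetical) maps a full program to (answer, error), where
    `error` is None if the solver accepted the program.
    Returns the first rule set that answers every example correctly, else None.
    """
    base_prompt = (
        "Extend this answer-set program to meet the new requirement.\n"
        f"Program:\n{theory}\nRequirement:\n{requirement}\n"
    )
    prompt = base_prompt
    for _ in range(max_rounds):
        rules = propose(prompt)
        feedback = []
        for ex in examples:
            # Validate the candidate rules on each example with the solver.
            answer, error = solve("\n".join([theory, rules, ex.facts]))
            if error is not None:
                feedback.append(f"Solver error: {error}")
            elif answer != ex.expected:
                feedback.append(
                    f"Facts: {ex.facts} expected {ex.expected}, got {answer}"
                )
        if not feedback:
            return rules  # all examples validated
        # Feed the failed attempt and the solver feedback back to the LLM.
        prompt = (
            f"{base_prompt}Previous attempt:\n{rules}\nFeedback:\n"
            + "\n".join(feedback)
        )
    return None
```

In practice, `solve` would wrap a real ASP solver and `propose` a call to an LLM. Both are left abstract here because the abstract does not specify them.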