AI Summary
Existing unified approaches to structured data question answering rely on predefined functions, limiting their capacity to handle complex reasoning tasks. This work proposes CRAFTQA, a framework that enables flexible reasoning over heterogeneous structured data (such as tables and knowledge graphs) by generating end-to-end executable Python code sequences. CRAFTQA introduces a dual-module architecture comprising CodeSTEP and CRAFT, where the latter dynamically synthesizes custom functions, thereby overcoming the constraints of fixed, predefined operations. Experimental results demonstrate that CRAFTQA significantly outperforms current unified methods across multiple structured question answering benchmarks, with particularly notable gains in scenarios requiring complex multi-hop or compositional reasoning.
Abstract
Real-world scenarios involve massive amounts of heterogeneous structured data (e.g., tables, knowledge graphs), making effective reasoning over such diverse data increasingly important. Unified structured data question answering has emerged as a prominent research trend, aiming to answer natural language questions across different structured data types within a single framework. However, existing unified methods share a common limitation: they rely on a set of predefined functions, which restricts their ability to perform complex reasoning beyond these predefined operations. To overcome this limitation, we propose CRAFTQA, a novel adaptive code-driven framework comprising two core modules, CodeSTEP and CRAFT. The CodeSTEP module generates a complete, executable Python code sequence that carries out the reasoning required by the question as step-by-step code operations. The CRAFT module dynamically generates custom code functions for operations beyond the predefined function set and integrates with CodeSTEP to increase flexibility in handling complex reasoning. Comprehensive experiments on multiple structured datasets demonstrate that CRAFTQA achieves remarkable improvements in complex reasoning scenarios compared to existing unified methods.
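To make the division of labor between the two modules concrete, the following is a minimal sketch under stated assumptions, not the CRAFTQA implementation. All function names, the toy table, and the two source strings are hypothetical; the strings stand in for model output, with one playing the role the abstract assigns to CRAFT (a custom function absent from the predefined set) and the other the role it assigns to CodeSTEP (a step-by-step executable program).

```python
# Hypothetical illustration only; none of these names come from CRAFTQA.

# A tiny "predefined function set" over a table stored as a list of dicts.
def filter_rows(table, column, value):
    """Keep rows whose `column` equals `value`."""
    return [row for row in table if row[column] == value]

def get_column(rows, column):
    """Project a single column out of a list of rows."""
    return [row[column] for row in rows]

PREDEFINED = {"filter_rows": filter_rows, "get_column": get_column}

# Stand-in for a model-generated custom function (the CRAFT role):
# an operation that the predefined set does not provide.
CUSTOM_FUNCTION_SOURCE = '''
def average(values):
    """Arithmetic mean of a non-empty list of numbers."""
    return sum(values) / len(values)
'''

# Stand-in for a model-generated step-by-step program (the CodeSTEP role).
PROGRAM_SOURCE = '''
rows = filter_rows(table, "category", "office")
prices = get_column(rows, "price")
answer = average(prices)
'''

def run_program(table, custom_source, program_source):
    """Register custom functions, then run the step sequence in one namespace."""
    namespace = dict(PREDEFINED)
    namespace["table"] = table
    exec(custom_source, namespace)   # define the custom function
    exec(program_source, namespace)  # execute the reasoning steps
    return namespace["answer"]

# Toy data, invented for this example.
TABLE = [
    {"item": "pen", "category": "office", "price": 2.0},
    {"item": "stapler", "category": "office", "price": 8.0},
    {"item": "mug", "category": "kitchen", "price": 5.0},
]

result = run_program(TABLE, CUSTOM_FUNCTION_SOURCE, PROGRAM_SOURCE)
print(result)  # 5.0
```

The program calls `average` exactly as it calls the predefined functions, which is the kind of integration the abstract describes. Running `exec` on model-generated code would need sandboxing in any real system; the sketch omits it for brevity.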