SymCode: A Neurosymbolic Approach to Mathematical Reasoning via Verifiable Code Generation

📅 2025-10-29
📈 Citations: 0
Influential: 0
🤖 AI Summary
Large language models (LLMs) suffer from unreliable text generation and lack deterministic verification mechanisms for complex mathematical reasoning. To address this, we propose SymCode, a neurosymbolic framework that reformulates mathematical reasoning as verifiable code generation, leveraging LLMs for high-level reasoning while delegating precise computation and validation to the SymPy symbolic engine. Its core innovation lies in replacing natural-language reasoning with programmatic outputs, thereby exposing errors explicitly and enabling automated correctness checking. Evaluated on MATH-500 and OlympiadBench, SymCode achieves up to a 13.6-percentage-point accuracy gain over state-of-the-art prompting methods (e.g., chain-of-thought) while reducing token consumption. This approach improves the accuracy, trustworthiness, and computational efficiency of formal mathematical reasoning.

📝 Abstract
Large Language Models (LLMs) often struggle with complex mathematical reasoning, where prose-based generation leads to unverified and arithmetically unsound solutions. Current prompting strategies like Chain of Thought still operate within this unreliable medium, lacking a mechanism for deterministic verification. To address these limitations, we introduce SymCode, a neurosymbolic framework that reframes mathematical problem-solving as a task of verifiable code generation using the SymPy library. We evaluate SymCode on challenging benchmarks, including MATH-500 and OlympiadBench, demonstrating significant accuracy improvements of up to 13.6 percentage points over baselines. Our analysis shows that SymCode is not only more token-efficient but also fundamentally shifts model failures from opaque logical fallacies towards transparent, programmatic errors. By grounding LLM reasoning in a deterministic symbolic engine, SymCode represents a key step towards more accurate and trustworthy AI in formal domains.
Problem

Research questions and friction points this paper is trying to address.

Addressing unreliable mathematical reasoning in LLMs through verifiable code generation
Improving accuracy in formal domains by grounding reasoning in symbolic engines
Shifting model failures from logical fallacies to transparent programmatic errors
Innovation

Methods, ideas, or system contributions that make the work stand out.

Generates verifiable code using SymPy library
Reframes math reasoning as symbolic programming task
Grounds LLM outputs in deterministic symbolic engine
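A minimal sketch of what such a verifiable program might look like. This is illustrative, not code from the paper: instead of prose, the model would emit a SymPy program whose result can be checked deterministically by substituting the answer back into the original equation.

```python
# Hypothetical SymCode-style output (illustrative; not from the paper):
# the LLM emits a program rather than prose, and correctness is checked
# symbolically instead of being asserted in natural language.
from sympy import Eq, simplify, solve, symbols

x = symbols('x')

# Example problem: find the real solutions of x^2 - 5x + 6 = 0.
equation = Eq(x**2 - 5*x + 6, 0)
roots = solve(equation, x)  # deterministic symbolic computation

# Verification step: each candidate root must satisfy the equation exactly.
assert all(simplify(equation.lhs.subs(x, r)) == 0 for r in roots)
print(sorted(roots))  # [2, 3]
```

Because the answer is produced and re-checked by the symbolic engine, a wrong step surfaces as a failed assertion or a runtime error rather than an opaque logical fallacy buried in prose.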