MathMistake Checker: A Comprehensive Demonstration for Step-by-Step Math Problem Mistake Finding by Prompt-Guided LLMs

📅 2025-03-06

🤖 AI Summary

This study addresses automated, fine-grained error localization in lengthy mathematical reasoning. It introduces a two-stage LLM-based diagnostic paradigm: first, prompt-guided chain-of-thought reasoning identifies logical errors at each step; second, cross-modal error attribution combines visual formula recognition (computer vision) with multi-step consistency verification. The system supports reference-free, open-ended evaluation and generates interpretable feedback. Key contributions include the first prompt-driven stepwise diagnostic framework, a scoring mechanism that needs no reference answers, and a vision-language collaborative error attribution model. Evaluated on computational and word problems, the system achieves 92.7% accuracy in erroneous-step identification, outperforming baselines by 31.4 percentage points. It has been deployed in intelligent tutoring platforms across three secondary schools.

📝 Abstract

We propose a novel system, MathMistake Checker, designed to automate step-by-step mistake finding in mathematical problems with lengthy answers through a two-stage process. The system aims to simplify grading, increase efficiency, and enhance learning experiences from a pedagogical perspective. It integrates advanced technologies, including computer vision and the chain-of-thought capabilities of the latest large language models (LLMs). Our system supports open-ended grading without reference answers and promotes personalized learning by providing targeted feedback. We demonstrate its effectiveness across various types of math problems, such as calculation and word problems.
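The step-by-step checking idea can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify prompts, models, or the vision component, so every name here (`split_steps`, `build_prompt`, `parse_reply`, `check_solution`, `first_mistake`, the `CORRECT` / `INCORRECT:` reply convention) is a hypothetical choice made for illustration. The computer-vision stage is assumed to have already transcribed the student's work into one step per line, and the LLM is passed in as a plain callable so that any client can be plugged in.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass
class StepVerdict:
    index: int      # 1-based position of the step in the solution
    step: str       # step text as transcribed
    correct: bool   # verdict parsed from the model reply
    feedback: str   # reason given for an incorrect step, else ""


def split_steps(transcribed_solution: str) -> List[str]:
    """Stand-in for the vision stage: assumes one step per non-empty line."""
    return [ln.strip() for ln in transcribed_solution.splitlines() if ln.strip()]


def build_prompt(problem: str, previous: List[str], step: str) -> str:
    """Hypothetical prompt: no reference answer, only the problem and prior steps."""
    shown = previous if previous else ["(none)"]
    return "\n".join([
        "You are checking a student's math solution one step at a time.",
        f"Problem: {problem}",
        "Previous steps:",
        *shown,
        f"Current step: {step}",
        "Reason about it, then end with a final line that is either "
        "CORRECT or INCORRECT: <short reason>.",
    ])


def parse_reply(reply: str) -> Tuple[bool, str]:
    """Read the verdict from the last non-empty line of the reply."""
    lines = [ln.strip() for ln in reply.splitlines() if ln.strip()]
    if not lines:
        raise ValueError("empty model reply")
    last = lines[-1]
    if last.upper().startswith("INCORRECT"):
        return False, last.partition(":")[2].strip()
    if last.upper().startswith("CORRECT"):
        return True, ""
    raise ValueError(f"unrecognized verdict line: {last!r}")


def check_solution(problem: str, transcribed_solution: str,
                   llm: Callable[[str], str]) -> List[StepVerdict]:
    """Query the model once per step, giving it the steps seen so far."""
    steps = split_steps(transcribed_solution)
    verdicts = []
    for i, step in enumerate(steps):
        ok, why = parse_reply(llm(build_prompt(problem, steps[:i], step)))
        verdicts.append(StepVerdict(i + 1, step, ok, why))
    return verdicts


def first_mistake(verdicts: List[StepVerdict]) -> Optional[StepVerdict]:
    """Return the earliest step flagged as incorrect, or None."""
    return next((v for v in verdicts if not v.correct), None)
```

To use it, pass a function that sends the prompt to an LLM and returns its text reply as the `llm` argument. The earliest flagged step and its `feedback` string can then be shown to the student as targeted feedback. One query per step keeps each verdict tied to a specific line, at the cost of more model calls than a single whole-solution prompt.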

Problem

Research questions and friction points this paper is trying to address.

Automates step-by-step mistake finding in math problems

Simplifies grading and enhances learning experiences

Supports open-ended grading and personalized feedback

Innovation

Methods, ideas, or system contributions that make the work stand out.

Two-stage process for mistake finding

Integrates computer vision and LLMs

Supports open-ended grading and feedback
