🤖 AI Summary
Existing mathematical word problem (MWP) benchmarks lack realistic distractors; the few datasets that do include them suffer from low difficulty, semantically implausible constructions, and easy detectability by models, undermining evaluation validity. Moreover, manually crafting distractors requires rewriting solution derivations, incurring high annotation costs. This paper proposes an iterative, large language model (LLM)-based framework for generating semantically coherent, hard-to-detect distractors that preserve the original problem's solution path and ground-truth answer. The approach employs multi-round, cognition-guided prompting to iteratively refine distractor generation. Key contributions include: (1) eliminating the need to rewrite answers or solutions, thereby drastically reducing human verification effort; and (2) enabling scalable construction of high-quality, challenging distractor-augmented MWPs. Experiments demonstrate that models fine-tuned on the augmented data exhibit significantly improved robustness to irrelevant information. Consequently, the method establishes a more reliable benchmark for assessing mathematical reasoning capabilities.
📝 Abstract
Mathematical reasoning serves as a crucial testbed for evaluating the intelligence of large language models (LLMs), and math word problems (MWPs) represent one of the most widely used formats. Most existing MWP datasets contain only the information necessary to solve each problem, while problems with distracting or superfluous conditions are often overlooked. Prior studies have shown that popular LLMs suffer a dramatic performance drop when such distracting conditions are introduced. However, available datasets of MWPs with distracting conditions remain limited, and most exhibit low difficulty and contextually unnatural phrasing. These shortcomings make the distracting conditions easy to detect and disregard, reducing the credibility of benchmarks built on these datasets. Moreover, adding distracting conditions may change the reasoning process and the answer, requiring intensive manual effort to check and rewrite solutions.
To address these issues, we design an iterative framework that leverages LLMs to generate distracting conditions automatically. We develop a set of prompts to revise MWPs from multiple perspectives and cognitive levels, encouraging the creation of meaningful distracting conditions as well as suggestions for further refinement. A key advantage of our framework is the preservation of shared solutions between the original and revised problems: the LLMs are explicitly guided to generate distractions that do not alter the original solution, thus eliminating the need to produce new answers. This framework is efficient and easy to deploy, substantially reducing the effort required to generate MWPs with distracting conditions while maintaining high data quality.
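The iterative generate-and-verify loop described above can be sketched as follows. This is a minimal illustration, not the paper's actual implementation: `call_llm` is a hypothetical stand-in for any chat-completion API (stubbed here with canned responses so the control flow runs end to end), and the prompt wording, round count, and `KEEP` verdict convention are all assumptions for the sketch.

```python
def call_llm(prompt: str) -> str:
    """Hypothetical LLM call; replace with a real API client.
    Stubbed with canned responses so the loop below is runnable."""
    if prompt.startswith("REVISE"):
        # Stub distractor: irrelevant to the arithmetic in the problem.
        return "On the way to the store, Tom also saw 3 birds."
    # Stub verifier verdict: the added condition leaves the solution intact.
    return "KEEP"

def add_distractor(problem: str, solution: str, max_rounds: int = 3) -> str:
    """Iteratively ask the model for a distracting condition that must not
    alter the original solution; retry in the next round if it does."""
    for _ in range(max_rounds):
        distractor = call_llm(
            "REVISE: add one meaningful distracting condition to this "
            "problem without changing its solution.\n"
            f"Problem: {problem}\nSolution (must remain valid): {solution}"
        )
        candidate = f"{problem} {distractor}"
        verdict = call_llm(
            "VERIFY: does the added condition change the solution?\n"
            f"Problem: {candidate}\nSolution: {solution}"
        )
        if verdict == "KEEP":
            # Solution preserved: accept the revised problem, no new
            # answer needs to be produced or checked by hand.
            return candidate
    return problem  # fall back to the original if no round succeeds

revised = add_distractor(
    "Tom buys 4 apples at $2 each. How much does he spend?",
    "4 * 2 = 8",
)
print(revised)
```

Because the verifier constrains every accepted revision to share the original solution, the original answer key carries over unchanged, which is what removes the manual rewriting cost the abstract highlights.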