🤖 AI Summary
Existing large language models (LLMs) perform suboptimally on multiple-choice question (MCQ) tasks because answer options are typically presented without contextual grounding or explanatory justification, leading to shallow reasoning and incomplete exploration of the answer space. To address this, we propose BiasPrompting, a novel prompting framework built on a two-stage reasoning mechanism: in the first stage, the model is prompted to generate a supportive justification for each candidate option; in the second stage, these justifications are aggregated and evaluated for logical consistency to reach a consensus answer. BiasPrompting combines prompt engineering with structured reasoning guidance and requires no additional training, fine-tuning, or parameter updates. Evaluated on five mainstream MCQ benchmarks, it achieves significant performance gains, particularly on challenging questions, demonstrating strong generalization and effectively enhancing LLMs' reasoning capabilities.
📝 Abstract
With the advancement of large language models (LLMs), their performance on multiple-choice question (MCQ) tasks has improved significantly. However, existing approaches face a key limitation: answer choices are typically presented to LLMs without contextual grounding or explanation. This absence of context can lead to incomplete exploration of the possible answers, ultimately degrading the models' reasoning capabilities. To address these challenges, we introduce BiasPrompting, a novel inference framework that guides LLMs to generate and critically evaluate reasoning for all plausible answer options before reaching a final prediction. It consists of two components: first, a reasoning generation stage, where the model is prompted to produce a supportive reasoning for each answer option, and then a reasoning-guided agreement stage, where the generated reasonings are synthesized to select the most plausible answer. In comprehensive evaluations, BiasPrompting delivers significant improvements on five widely used multiple-choice question answering benchmarks. Our experiments show that BiasPrompting enhances the reasoning capabilities of LLMs and provides a strong foundation for tackling complex and challenging questions, particularly in settings where existing methods underperform.
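The two-stage flow described in the abstract can be sketched as follows. This is a minimal illustrative implementation, not the paper's code: the prompt wordings are assumptions, and `llm` stands in for any text-in/text-out model call (e.g. a wrapper around a chat API).

```python
from typing import Callable, Dict, List

def build_reasoning_prompt(question: str, option: str) -> str:
    # Stage 1: ask the model to argue *for* one candidate option
    # (prompt wording is a hypothetical example, not the paper's template).
    return (
        f"Question: {question}\n"
        f"Assume the correct answer is: {option}\n"
        "Give a brief supporting justification for this answer."
    )

def build_agreement_prompt(
    question: str, options: List[str], reasonings: Dict[str, str]
) -> str:
    # Stage 2: present all per-option justifications and ask the model
    # to weigh them against each other and commit to one answer.
    lines = [f"Question: {question}",
             "Candidate answers and their supporting reasonings:"]
    for opt in options:
        lines.append(f"- {opt}: {reasonings[opt]}")
    lines.append("Considering all reasonings above, which answer is most "
                 "plausible? Reply with the answer text only.")
    return "\n".join(lines)

def bias_prompting(
    question: str, options: List[str], llm: Callable[[str], str]
) -> str:
    # Generate one supportive reasoning per option, then synthesize.
    reasonings = {opt: llm(build_reasoning_prompt(question, opt))
                  for opt in options}
    return llm(build_agreement_prompt(question, options, reasonings))
```

In use, `llm` would be replaced by a real model call; the method needs no training or parameter updates, only these two rounds of prompting per question.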