🤖 AI Summary
To address logical inconsistency and unverifiable rule adherence in large language model (LLM) inference, this paper proposes the Rule-Guided Feedback (RGF) framework. Within a teacher-student paradigm, RGF embeds hard task-specific rules into an iterative feedback loop that combines multi-stage teacher evaluation, rule-driven bias detection, structured feedback that withholds direct answers, and adaptive information seeking, allowing it to dynamically identify reasoning deviations and trigger active retrieval to mitigate uncertainty. Its core contribution is the first realization of *verifiably embedded* rule constraints and *controllable intervention* in the reasoning process, overcoming fundamental limitations of conventional fine-tuning and prompt engineering in ensuring logical consistency. Evaluated across diverse tasks, including Checkmate-in-One, Sonnet Writing, Penguins-in-a-Table, GSM8K, and StrategyQA, RGF achieves significant gains in both accuracy and rule compliance, and generalizes better than state-of-the-art baselines.
📝 Abstract
In this paper, we introduce Rule-Guided Feedback (RGF), a framework designed to enhance Large Language Model (LLM) performance through structured rule adherence and strategic information seeking. RGF implements a teacher-student paradigm in which rule adherence is enforced through established guidelines. Our framework employs a Teacher model that rigorously evaluates each Student output against task-specific rules and, upon detecting deviations, provides constructive guidance rather than direct answers. This iterative feedback loop serves two crucial purposes: keeping solutions within defined constraints and encouraging proactive information seeking to resolve uncertainties. We evaluate RGF on diverse tasks including Checkmate-in-One puzzles, Sonnet Writing, Penguins-in-a-Table classification, GSM8K, and StrategyQA. Our findings suggest that structured feedback mechanisms can significantly enhance LLMs' performance across various domains.
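The teacher-student loop described above can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: the function names (`check_rules`, `rgf_loop`), the feedback format, and the round limit are all assumptions, and real LLM calls are replaced by a toy student callable.

```python
# Hypothetical sketch of an RGF-style feedback loop (names are illustrative).

def check_rules(answer, rules):
    """Teacher step: return the names of the task-specific rules the answer violates."""
    return [name for name, passes in rules.items() if not passes(answer)]

def rgf_loop(student, rules, max_rounds=3):
    """Iterate: the student proposes an answer, the teacher checks it against the
    rules, and structured feedback (never the answer itself) guides the next attempt
    until all rules pass or the round budget is exhausted."""
    feedback = None
    answer = None
    for _ in range(max_rounds):
        answer = student(feedback)              # student attempt, conditioned on feedback
        violations = check_rules(answer, rules)
        if not violations:                      # all task-specific rules satisfied
            return answer
        feedback = f"Violated rules: {violations}"  # guidance only, no direct answer
    return answer                               # best effort after max_rounds

# Toy usage: a "student" whose second attempt satisfies a length rule.
rules = {"under_10_words": lambda a: len(a.split()) < 10}
attempts = iter(["word " * 12, "short answer"])
student = lambda fb: next(attempts).strip()
print(rgf_loop(student, rules))  # -> short answer
```

The key design point mirrored here is that the teacher emits only rule violations as feedback, so the student must repair its own reasoning rather than copy a provided solution.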