🤖 AI Summary
Large language models (LLMs) handle dynamic game interactions well but struggle to follow rigid, multi-step transaction protocols such as browse-offer-review-confirm, which undermines player trust and makes the system hard to verify. To address this, we propose Autoregressive State-Tracking Prompting (ASTP), a prompting framework that has the model explicitly report a predefined state label for the previous turn at each step, and pairs this with state-specific placeholder post-processing so that prices are calculated accurately. Evaluated on 300 trading dialogues, ASTP achieves >99% state compliance and 99.3% price calculation accuracy. With placeholder post-processing, a smaller model (Gemini-2.5-Flash) matches the performance of a larger one (Gemini-2.5-Pro) while cutting response time from 21.2 s to 2.4 s, which suits the real-time and resource constraints of commercial games.
📝 Abstract
Large Language Models (LLMs) enable dynamic game interactions but fail to follow essential procedural flows in rule-governed trading systems, eroding player trust. This work resolves the core tension between the creative flexibility of LLMs and the procedural demands of in-game trading (browse-offer-review-confirm). To this end, it introduces Autoregressive State-Tracking Prompting (ASTP), a methodology centered on a strategically orchestrated prompt that compels an LLM to make its state-tracking process explicit and verifiable. Instead of relying on implicit contextual understanding, ASTP tasks the LLM with identifying and reporting a predefined state label from the previous turn. To ensure transactional integrity, this is complemented by a state-specific placeholder post-processing method for accurate price calculations. Evaluation across 300 trading dialogues demonstrates >99% state compliance and 99.3% calculation precision. Notably, ASTP with placeholder post-processing on a smaller model (Gemini-2.5-Flash) matches the performance of a larger model (Gemini-2.5-Pro) while reducing response time from 21.2 s to 2.4 s, establishing a practical foundation that satisfies both the real-time requirements and the resource constraints of commercial games.
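The two mechanisms described above, explicit state reporting and placeholder post-processing, can be illustrated with a minimal Python sketch. It is not the authors' prompt or code. The state names, the `[PREV_STATE: ...] [STATE: ...]` header, the transition table, the `{{TOTAL}}` token, and the choice of which states carry a price are all hypothetical, chosen only to show the idea: the game engine, not the model, verifies the reported state and computes the price.

```python
import re

# Assumed transition table for a browse-offer-review-confirm flow:
# a state may repeat or advance by exactly one step (hypothetical rules).
ALLOWED = {
    None: {"BROWSE"},
    "BROWSE": {"BROWSE", "OFFER"},
    "OFFER": {"OFFER", "REVIEW"},
    "REVIEW": {"REVIEW", "CONFIRM"},
    "CONFIRM": {"BROWSE"},
}

# Hypothetical header the prompt would ask the model to emit on every turn.
STATE_TAG = re.compile(r"\[PREV_STATE:\s*(\w+)\]\s*\[STATE:\s*(\w+)\]")

# Assumption: only these states are allowed to quote a price.
PRICE_STATES = {"OFFER", "REVIEW", "CONFIRM"}
PLACEHOLDER = "{{TOTAL}}"


def check_turn(tracked_prev, reply):
    """Validate the model's self-reported states and return the new state."""
    match = STATE_TAG.search(reply)
    if match is None:
        raise ValueError("reply has no state header")
    reported_prev, current = match.group(1), match.group(2)
    expected_prev = tracked_prev if tracked_prev is not None else "NONE"
    if reported_prev != expected_prev:
        raise ValueError(f"model reported {reported_prev}, expected {expected_prev}")
    if current not in ALLOWED[tracked_prev]:
        raise ValueError(f"illegal transition {tracked_prev} -> {current}")
    return current


def fill_total(reply, state, cart, prices):
    """Replace the price placeholder with a total computed outside the model."""
    if state not in PRICE_STATES:
        return reply
    total = sum(prices[item] * qty for item, qty in cart.items())
    return reply.replace(PLACEHOLDER, str(total))


reply = "[PREV_STATE: BROWSE] [STATE: OFFER] That comes to {{TOTAL}} gold."
state = check_turn("BROWSE", reply)
final = fill_total(reply, state, {"potion": 3, "sword": 1}, {"potion": 25, "sword": 120})
print(final)
```

In this sketch a reply that jumps from browsing straight to confirmation is rejected by `check_turn`, and the model never performs arithmetic itself, since it only emits the placeholder. Integer prices are used here to avoid floating-point rounding in the total.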