Aligning Large Language Models with Procedural Rules: An Autoregressive State-Tracking Prompting for In-Game Trading

📅 2025-10-28
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
Large language models (LLMs) perform well in dynamic game interactions but struggle to consistently adhere to rigid, multi-step transaction protocols—such as browse-offer-review-confirm—undermining user trust and system verifiability. To address this, we propose Autoregressive State-Tracking Prompting (ASTP), a novel prompting framework that enforces explicit, stepwise state reporting via declarative state labels and applies state-specific placeholder-based post-processing to ensure precise price computation. ASTP significantly improves procedural compliance and numerical accuracy, especially for compact LLMs. Evaluated on 300 transaction dialogues, it achieves over 99% state compliance and 99.3% price calculation accuracy, while reducing average response latency from 21.2 seconds to 2.4 seconds. The method thus enhances real-time responsiveness, auditability, and computational efficiency without compromising fidelity to protocol constraints.

📝 Abstract
Large Language Models (LLMs) enable dynamic game interactions but fail to follow essential procedural flows in rule-governed trading systems, eroding player trust. This work resolves the core tension between the creative flexibility of LLMs and the procedural demands of in-game trading (browse-offer-review-confirm). To this end, Autoregressive State-Tracking Prompting (ASTP) is introduced, a methodology centered on a strategically orchestrated prompt that compels an LLM to make its state-tracking process explicit and verifiable. Instead of relying on implicit contextual understanding, ASTP tasks the LLM with identifying and reporting a predefined state label from the previous turn. To ensure transactional integrity, this is complemented by a state-specific placeholder post-processing method for accurate price calculations. Evaluation across 300 trading dialogues demonstrates >99% state compliance and 99.3% calculation precision. Notably, ASTP with placeholder post-processing on smaller models (Gemini-2.5-Flash) matches larger models' (Gemini-2.5-Pro) performance while reducing response time from 21.2s to 2.4s, establishing a practical foundation that satisfies both real-time requirements and resource constraints of commercial games.
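The explicit state-reporting step described in the abstract can be sketched as a prompt-assembly and compliance-check routine. Everything below is an illustrative assumption — the state names (BROWSE/OFFER/REVIEW/CONFIRM are taken from the abstract's flow), the `PREV_STATE`/`NEXT_STATE` label format, and the function names are not the paper's actual templates:

```python
# Hypothetical sketch of ASTP-style prompting: the model must restate the
# previous turn's state label and declare the next state before replying,
# so each transition is explicit and externally verifiable.
TRADE_STATES = ["BROWSE", "OFFER", "REVIEW", "CONFIRM"]

def build_astp_prompt(dialogue_history: list[str], prev_state: str) -> str:
    """Assemble a prompt that forces declarative state reporting.

    The wording of the rules block is an assumption, not the paper's
    exact prompt text.
    """
    assert prev_state in TRADE_STATES
    rules = (
        "You are an in-game merchant. Before replying, output two lines:\n"
        "PREV_STATE: <the state label of the previous turn>\n"
        "NEXT_STATE: <one of BROWSE, OFFER, REVIEW, CONFIRM>\n"
        "Only then write your reply to the player."
    )
    history = "\n".join(dialogue_history)
    return f"{rules}\n\nPrevious state: {prev_state}\n\nDialogue:\n{history}\n"

def check_state_compliance(model_output: str, expected_prev: str) -> bool:
    """Verify the model's declared PREV_STATE matches the system's record."""
    for line in model_output.splitlines():
        if line.startswith("PREV_STATE:"):
            return line.split(":", 1)[1].strip() == expected_prev
    return False
```

Because the declared label is plain text in a fixed position, compliance (the paper's >99% figure) can be measured by string comparison rather than by interpreting the model's free-form reply.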
Problem

Research questions and friction points this paper is trying to address.

LLMs fail to follow procedural rules in game trading systems
Resolving tension between LLM flexibility and procedural trading demands
Ensuring transactional integrity with accurate state tracking and calculations
Innovation

Methods, ideas, or system contributions that make the work stand out.

Autoregressive State-Tracking Prompting for explicit state verification
State-specific placeholder post-processing for accurate calculations
Smaller models achieve performance parity with faster response times
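The placeholder post-processing idea above can be sketched as follows: instead of asking the LLM to do arithmetic in-line, it emits a symbolic placeholder, and the system substitutes an exact total computed from the item catalog. The placeholder syntax, catalog contents, and function name here are hypothetical, not the paper's format:

```python
import re

# Hypothetical item catalog; prices are made-up illustration values.
CATALOG = {"health potion": 50, "iron sword": 120}

def fill_price_placeholders(text: str) -> str:
    """Replace {PRICE:<item> x<qty>} placeholders with exact totals.

    The model never computes numbers itself; the deterministic
    post-processor does, which is what guarantees calculation accuracy.
    """
    def compute(match: re.Match) -> str:
        item = match.group(1).strip()
        qty = int(match.group(2))
        return str(CATALOG[item] * qty)

    return re.sub(r"\{PRICE:(.+?)\s*x\s*(\d+)\}", compute, text)
```

For example, a reply like `"Three iron swords cost {PRICE:iron sword x3} gold."` would be rewritten with the exact product of the catalog price and quantity, so numerical correctness no longer depends on the model's arithmetic.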
Minkyung Kim
SayBerryGames
Junsik Kim
Amazon
Computer Vision · Machine Learning
Woongcheol Yang
SayBerryGames
Sangdon Park
SayBerryGames
Sohee Bae
SayBerryGames