🤖 AI Summary
This study investigates how narrative framing influences reward-based AI decision-making, focusing on the interplay between reinforcement learning (RL) and language model (LM) reasoning. To address this, the authors propose a dual-system architecture that integrates interpretable, narrative-driven LM reasoning into the RL decision pipeline. A modular narrative parameter interface enables controlled manipulation of narrative type, character roles, and causal structure without altering environment dynamics or reward functions, thereby enabling narrative-guided policy learning. Experiments are conducted in a configurable gridworld environment, supported by a multi-dimensional logging system that synchronously records state-action trajectories, Q-values, and natural-language reasoning traces. The platform is designed to test whether narrative framing shifts policy preferences and exploration behavior, and to probe the interaction between symbolic narrative reasoning and numerical optimization. This work offers a preliminary foundation for explainable, value-aligned AI decision-making.
📝 Abstract
We present a preliminary experimental platform that explores how narrative elements might shape AI decision-making by combining reinforcement learning (RL) with language model reasoning. While AI systems can now both make decisions and engage in narrative reasoning, these capabilities have mostly been studied separately. Our platform attempts to bridge this gap using a dual-system architecture to examine how narrative frameworks could influence reward-based learning. The system comprises a reinforcement learning policy that suggests actions based on past experience, and a language model that processes these suggestions through different narrative frameworks to guide decisions. This setup enables initial experimentation with narrative elements while maintaining consistent environment and reward structures. We implement this architecture in a configurable gridworld environment, where agents receive both policy suggestions and information about their surroundings. The platform's modular design facilitates controlled testing of environmental complexity, narrative parameters, and the interaction between reinforcement learning and narrative-based decisions. Our logging system captures basic decision metrics, from RL policy values to language model reasoning to action selection patterns. While preliminary, this implementation provides a foundation for studying how different narrative frameworks might affect reward-based decisions and exploring potential interactions between optimization-based learning and symbolic reasoning in AI systems.
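The dual-system loop described above (an RL policy proposes an action, and a language model filters that suggestion through a narrative frame before the final choice is logged) can be sketched as follows. This is a minimal illustration under assumptions: the names (`q_suggest`, `narrative_rerank`, the `"cautious_explorer"` frame) are hypothetical stand-ins, and the language model is replaced by a simple rule, not the authors' actual implementation.

```python
import random

ACTIONS = ["up", "down", "left", "right"]

def q_suggest(q_table, state):
    """System 1: the RL policy proposes the greedy action from its Q-values."""
    qs = {a: q_table.get((state, a), 0.0) for a in ACTIONS}
    return max(qs, key=qs.get), qs

def narrative_rerank(suggestion, qs, frame):
    """System 2: a stand-in for the language model, which would weigh the RL
    suggestion against the active narrative frame. Here an illustrative
    'cautious_explorer' frame overrides low-confidence greedy choices."""
    if frame == "cautious_explorer" and qs[suggestion] < 0.1:
        alternative = random.choice([a for a in ACTIONS if a != suggestion])
        reason = f"Q-value {qs[suggestion]:.2f} is low; frame '{frame}' prefers exploring."
        return alternative, reason
    return suggestion, f"Frame '{frame}' accepts the RL suggestion."

def step(q_table, state, frame, log):
    """One decision step, with synchronized logging of Q-values,
    the reasoning trace, and the final action."""
    suggestion, qs = q_suggest(q_table, state)
    action, reasoning = narrative_rerank(suggestion, qs, frame)
    log.append({"state": state, "q_values": qs, "rl_suggestion": suggestion,
                "reasoning": reasoning, "action": action})
    return action

# Usage: one step in which the narrative frame defers to the RL policy.
log = []
q_table = {((0, 0), "right"): 0.8}
action = step(q_table, (0, 0), "hero_quest", log)
```

Because environment dynamics and rewards stay fixed, swapping the `frame` argument is the only experimental manipulation, which mirrors the controlled-testing design the abstract describes.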