🤖 AI Summary
Existing game-testing agents, whether RL-, IL-, or LLM-based, exhibit severe behavioral homogeneity: they fail to emulate the diverse strategies that arise from human players' individual personality differences, which leads to insufficient interaction coverage and low detection rates for edge cases. To address this, we propose the first systematic integration of personality traits into game-testing agents, introducing an LLM-based multi-strategy decision framework. Our approach uses personality trait vectors to dynamically modulate the joint output of reinforcement learning and imitation learning policies, so that agents take behaviorally distinct actions under similar contextual conditions. Evaluated across multiple games, our method improves test coverage and produces richer in-game interactions. In Minecraft, it surpasses state-of-the-art baselines in task completion rate and solution diversity, demonstrating an enhanced capability to uncover rare scenarios and complex interactions.
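The summary does not spell out how a trait vector modulates the two policies, so the following is only an illustrative sketch of one plausible reading, not the method used in the paper. It assumes discrete actions, a fixed mixing weight between the RL and IL action distributions, and a log-linear tilt toward actions whose trait profile matches the agent's personality. The names `modulate`, `il_weight`, `ACTION_TRAITS`, and the two trait dimensions (aggression, caution) are all hypothetical.

```python
import math
import random

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def modulate(rl_probs, il_probs, action_traits, personality, il_weight=0.5):
    """Blend two action distributions, then tilt the blend by personality.

    Assumption (not from the paper): each action carries a trait profile,
    and its log-probability is shifted by the dot product of that profile
    with the agent's personality vector.
    """
    mixed = [(1 - il_weight) * r + il_weight * i
             for r, i in zip(rl_probs, il_probs)]
    logits = [
        math.log(p + 1e-12) + sum(t * w for t, w in zip(traits, personality))
        for p, traits in zip(mixed, action_traits)
    ]
    return softmax(logits)

# Hypothetical toy setup: three actions, two trait dimensions
# (aggression, caution).
ACTIONS = ["fight", "sneak", "explore"]
RL_PROBS = [0.5, 0.3, 0.2]
IL_PROBS = [0.4, 0.4, 0.2]
ACTION_TRAITS = [[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]]

aggressive_probs = modulate(RL_PROBS, IL_PROBS, ACTION_TRAITS, [2.0, 0.0])
cautious_probs = modulate(RL_PROBS, IL_PROBS, ACTION_TRAITS, [0.0, 2.0])

# Sampling an action for one agent; a seeded generator keeps runs repeatable.
rng = random.Random(0)
chosen = rng.choices(ACTIONS, weights=aggressive_probs, k=1)[0]
```

In this toy setup the same game state (identical `RL_PROBS` and `IL_PROBS`) leads the aggressive agent to favor `fight` and the cautious agent to favor `sneak`, which is the kind of behavioral divergence under similar conditions that the summary describes. A zero personality vector leaves the blended distribution unchanged.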
📝 Abstract
Modern video games pose significant challenges for traditional automated testing algorithms, yet intensive testing is crucial to ensure game quality. To address these challenges, researchers have designed gaming agents using Reinforcement Learning, Imitation Learning, or Large Language Models. However, these agents often neglect the diverse strategies that human players adopt because of their different personalities, and therefore produce repetitive solutions in similar situations. Without mimicking varied gaming strategies, these agents struggle to trigger diverse in-game interactions or uncover edge cases.
In this paper, we present MIMIC, a novel framework that integrates diverse personality traits into gaming agents, enabling them to adopt different gaming strategies for similar situations. By mimicking different playstyles, MIMIC can achieve higher test coverage and richer in-game interactions across different games. It also outperforms state-of-the-art agents in Minecraft by achieving a higher task completion rate and providing more diverse solutions. These results highlight MIMIC's significant potential for effective game testing.