🤖 AI Summary
Multi-agent interactions compound the opacity and unpredictability of AI decision-making, hindering trustworthy deployment. To address this, we propose FAIRGAME, a standardized, reproducible testing framework that integrates game-theoretic modeling with bias analysis of large language model (LLM) agents. FAIRGAME supports flexible configuration of LLMs, languages, personality traits, and strategic knowledge, enabling systematic quantification of biases across the model, language, and strategy dimensions via multi-agent simulation, as well as prediction of emergent behavioral patterns. Its key innovation is the deep embedding of formal game-theoretic analysis into the LLM agent evaluation pipeline, yielding interpretable bias quantification and theory-grounded behavioral prediction. Extensive experiments on canonical game-theoretic scenarios demonstrate FAIRGAME's effectiveness in bias detection, result reproducibility, and theoretical consistency.
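The summary's "theory-grounded behavioral prediction" amounts to comparing simulated agent behavior against a game-theoretic baseline. As a minimal sketch (the payoff values and function names below are illustrative, not FAIRGAME's actual API), here is how one could compute the pure-strategy Nash equilibrium of a one-shot Prisoner's Dilemma, the benchmark an LLM agent's choices would be measured against:

```python
from itertools import product

# Payoff table for the row and column players; "C" = cooperate, "D" = defect.
# These are the textbook Prisoner's Dilemma values, chosen for illustration.
PAYOFF = {
    ("C", "C"): (3, 3),
    ("C", "D"): (0, 5),
    ("D", "C"): (5, 0),
    ("D", "D"): (1, 1),
}
ACTIONS = ("C", "D")

def best_response(opponent_action, player):
    """Action maximizing this player's payoff against a fixed opponent move."""
    idx = 0 if player == "row" else 1
    def payoff(action):
        pair = (action, opponent_action) if player == "row" else (opponent_action, action)
        return PAYOFF[pair][idx]
    return max(ACTIONS, key=payoff)

def pure_nash():
    """Profiles where each player's action is a best response to the other's."""
    return [
        (r, c)
        for r, c in product(ACTIONS, ACTIONS)
        if r == best_response(c, "row") and c == best_response(r, "col")
    ]

print(pure_nash())  # mutual defection is the unique equilibrium: [('D', 'D')]
```

Systematic deviation of agents from this baseline, varying with model, language, or personality, is exactly the kind of pattern the framework is designed to surface.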
📝 Abstract
Letting AI agents interact in multi-agent applications adds a layer of complexity to the interpretability and prediction of AI outcomes, with profound implications for their trustworthy adoption in research and society. Game theory offers powerful models to capture and interpret strategic interaction among agents, but it requires the support of reproducible, standardized, and user-friendly software frameworks to enable comparison and interpretation of results. To this end, we present FAIRGAME, a Framework for AI Agents Bias Recognition using Game Theory. We describe its implementation and usage, and we employ it to uncover biased outcomes in popular games among AI agents, depending on the Large Language Model (LLM) employed, the language used, and the agents' personality traits or strategic knowledge. Overall, FAIRGAME lets users reliably and easily simulate their desired games and scenarios, and compare the results across simulation campaigns and against game-theoretic predictions, enabling the systematic discovery of biases, the anticipation of behavior emerging from strategic interplay, and further research into strategic decision-making with LLM agents.
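The abstract describes sweeping simulations over LLMs, languages, and agent personality traits. A campaign of that shape is essentially a cross-product over those configuration axes; the sketch below illustrates the idea with hypothetical field names (FAIRGAME's real configuration schema may differ):

```python
from dataclasses import dataclass
from itertools import product

@dataclass
class CampaignConfig:
    """Hypothetical FAIRGAME-style campaign: one game, swept over several axes."""
    game: str
    llms: list
    languages: list
    personalities: list
    rounds: int = 10

    def instances(self):
        """One simulation instance per (llm, language, personality) triple."""
        return [
            {"game": self.game, "llm": m, "language": lang,
             "personality": p, "rounds": self.rounds}
            for m, lang, p in product(self.llms, self.languages, self.personalities)
        ]

# Example campaign: 2 models x 2 languages x 2 personalities = 8 instances.
cfg = CampaignConfig(
    game="prisoners_dilemma",
    llms=["model-a", "model-b"],          # placeholder model names
    languages=["en", "it"],
    personalities=["cooperative", "selfish"],
)
runs = cfg.instances()
print(len(runs))  # 8
```

Enumerating the full grid this way is what makes the comparison "across simulation campaigns" systematic: every bias dimension varies while the game itself stays fixed.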