Can Generative AI agents behave like humans? Evidence from laboratory market experiments

📅 2025-05-12
📈 Citations: 0
Influential: 0
🤖 AI Summary
This study investigates whether large language models (LLMs) can replicate human bounded rationality in closed-loop dynamic market experiments. To this end, we design a multi-agent system incorporating real-time price feedback, a three-step rolling context window, and high-variance response strategies to simulate the evolution of positive- and negative-feedback markets. We systematically compare LLM agent behavior against empirical data from human participants in controlled laboratory markets. Our results provide the first empirical validation—within a closed-loop market setting—that LLM agents reproduce key macro-level market regularities observed in human subjects, including price convergence and characteristic volatility patterns, thereby exhibiting human-like bounded rationality. However, significant discrepancies persist at the micro-level, particularly in decision diversity and behavioral granularity. These findings establish a scalable, empirically grounded methodological framework for AI-driven economic simulation, enabling rigorous, testable hypotheses about agent-based market dynamics.

📝 Abstract
We explore the potential of Large Language Models (LLMs) to replicate human behavior in economic market experiments. Compared to previous studies, we focus on dynamic feedback between LLM agents: the decisions of each LLM impact the market price at the current step, and so affect the decisions of the other LLMs at the next step. We compare LLM behavior to market dynamics observed in laboratory settings and assess their alignment with human participants' behavior. Our findings indicate that LLMs do not adhere strictly to rational expectations, displaying instead bounded rationality, much like human participants. Providing a minimal context window, i.e., a memory of the three previous time steps, combined with a high-variability setting capturing response heterogeneity, allows LLMs to replicate broad trends seen in human experiments, such as the distinction between positive and negative feedback markets. However, differences remain at a granular level: LLMs exhibit less heterogeneity in behavior than humans. These results suggest that LLMs hold promise as tools for simulating realistic human behavior in economic contexts, though further research is needed to refine their accuracy and increase behavioral diversity.
Problem

Research questions and friction points this paper is trying to address.

Can LLMs replicate human behavior in economic markets?
Do LLMs show bounded rationality like humans in markets?
Can LLMs simulate human market trends with minimal memory?
Innovation

Methods, ideas, or system contributions that make the work stand out.

Dynamic feedback between LLM agents
Minimal context window with memory
High variability setting for response heterogeneity
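The closed-loop design described above can be sketched in a few lines: agents forecast the next price from a three-step rolling window of past prices, and the realized price is computed from the pooled forecasts, feeding back into the next round. This is a minimal illustrative sketch, not the paper's implementation: the price rule, fundamental value, noise level, and the naive trend-extrapolating stand-in for an LLM agent are all hypothetical choices made here for demonstration.

```python
import random

def feedback_price(forecasts, fundamental=60.0, weight=0.95, positive=True):
    # Hypothetical price rule: the realized price moves with (positive
    # feedback) or against (negative feedback) the deviation of the mean
    # forecast from a fundamental value. Parameters are illustrative only.
    mean_forecast = sum(forecasts) / len(forecasts)
    sign = 1.0 if positive else -1.0
    return fundamental + sign * weight * (mean_forecast - fundamental)

def agent_forecast(history, noise=2.0):
    # Stand-in for an LLM agent: naively extrapolates the recent trend
    # from its rolling context window, with Gaussian noise mimicking the
    # high-variance response setting.
    if len(history) < 2:
        return history[-1] + random.gauss(0, noise)
    trend = history[-1] - history[-2]
    return history[-1] + trend + random.gauss(0, noise)

def simulate(n_agents=6, n_steps=50, window=3, positive=True, seed=0):
    random.seed(seed)
    prices = [50.0]  # arbitrary starting price
    for _ in range(n_steps):
        context = prices[-window:]  # three-step rolling context window
        forecasts = [agent_forecast(context) for _ in range(n_agents)]
        prices.append(feedback_price(forecasts, positive=positive))
    return prices

negative_market = simulate(positive=False)
positive_market = simulate(positive=True)
```

In the actual study each `agent_forecast` call would be a prompted LLM query carrying the rolling window as context; the toy extrapolator here only preserves the loop structure that distinguishes closed-loop from one-shot evaluation.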