An Infectious Disease Spread Simulation Based on Large Language Model Decision Making

📅 2026-06-04

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This study models individuals' decisions to self-report illness during infectious disease outbreaks, with the aim of supporting targeted public health interventions. It develops a spatially explicit agent-based simulation framework that couples large language model–generated individual decisions with a synthetic population built from real census data and assigned to spatial units within cities. The framework compares three decision scenarios: independent reasoning, household influence, and message framing. In simulations of San Francisco and Atlanta, income and education emerge as the dominant drivers of variation in influenza-like illness reporting rates, with smaller but consistent effects from geography, choice of LLM, and message framing. The resulting synthetic data capture behavioral heterogeneity across both social and geographic dimensions.

📝 Abstract

Modelling individual decision-making during infectious disease outbreaks is crucial for understanding behavioural dynamics and informing effective public health interventions. Prior work has shown that large language models can simulate realistic human behaviour by generating agent decisions based on demographic prompts and situational context. We build on this foundation with a spatially grounded, agent-based simulation framework that integrates LLM-generated decisions about self-reported influenza-like illness into a census-based synthetic population of agents. Location is treated as a central feature: agents are assigned to spatial units within cities, capturing the spatial distributions of different demographic groups using real-world census data and enabling geographically diverse behavioural modelling. We implement and compare three decision scenarios (independent reasoning, household influence, and message framing) and simulate self-reporting outcomes in San Francisco and Atlanta. Results reveal that income and education are the dominant drivers of reporting rate variation, with smaller but consistent effects from geography, choice of LLM, and message framing. Our framework generates synthetic data that captures both social and geographic heterogeneity, supporting spatial epidemiological modelling and bias-aware behavioural analysis.
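
To make the pipeline described in the abstract more concrete, the following is a minimal Python sketch of how such a simulation could be wired together. It is not the authors' implementation. The tract labels, demographic shares, prompt wording, and every name in it (`Agent`, `sample_population`, `build_prompt`, `simulate`, `stub_decide`, `EXAMPLE_PROFILES`) are illustrative assumptions. In particular, `stub_decide` is an arbitrary placeholder standing in for a real LLM call, so its outputs carry no empirical meaning.

```python
import random
from collections import defaultdict
from dataclasses import dataclass

# Made-up marginal shares per spatial unit; a real run would use census tables.
EXAMPLE_PROFILES = {
    "tract_A": {"income": {"low": 0.6, "high": 0.4},
                "education": {"high school": 0.7, "college": 0.3}},
    "tract_B": {"income": {"low": 0.2, "high": 0.8},
                "education": {"high school": 0.3, "college": 0.7}},
}


@dataclass(frozen=True)
class Agent:
    agent_id: int
    tract: str          # spatial unit the agent is assigned to
    household_id: str
    income: str
    education: str


def sample_population(profiles, n_per_tract, household_size=2, seed=0):
    """Draw agents for each spatial unit from its marginal distributions."""
    rng = random.Random(seed)
    agents = []
    for tract, profile in profiles.items():
        for i in range(n_per_tract):
            inc, edu = profile["income"], profile["education"]
            agents.append(Agent(
                agent_id=len(agents),
                tract=tract,
                household_id=f"{tract}-{i // household_size}",
                income=rng.choices(list(inc), list(inc.values()))[0],
                education=rng.choices(list(edu), list(edu.values()))[0],
            ))
    return agents


def build_prompt(agent, scenario, household_reports=0, message=""):
    """Compose the demographic prompt for one of the three scenarios."""
    prompt = (f"You live in {agent.tract}. Your income is {agent.income} and "
              f"your education level is {agent.education}. You have "
              "influenza-like symptoms. Would you self-report them?")
    if scenario == "household":
        prompt += f" {household_reports} other member(s) of your household reported."
    elif scenario == "framing":
        prompt += f" Public health message: {message}"
    elif scenario != "independent":
        raise ValueError(f"unknown scenario: {scenario}")
    return prompt


def stub_decide(prompt):
    """Placeholder for an LLM call: an arbitrary but deterministic yes/no."""
    return len(prompt) % 2 == 0


def simulate(agents, scenario, decide, message=""):
    """Return the self-reporting rate for each spatial unit."""
    # First pass: independent decisions, also used as household context.
    baseline = {a.agent_id: decide(build_prompt(a, "independent")) for a in agents}
    household_total = defaultdict(int)
    for a in agents:
        household_total[a.household_id] += baseline[a.agent_id]

    decisions = defaultdict(list)
    for a in agents:
        if scenario == "independent":
            decision = baseline[a.agent_id]
        else:
            # Count reports from the agent's household, excluding the agent.
            others = household_total[a.household_id] - baseline[a.agent_id]
            decision = decide(build_prompt(a, scenario, others, message))
        decisions[a.tract].append(decision)
    return {tract: sum(d) / len(d) for tract, d in decisions.items()}
```

A call such as `simulate(sample_population(EXAMPLE_PROFILES, 4), "household", stub_decide)` returns one reporting rate per spatial unit. Passing the decision function as an argument keeps the population and scenario logic independent of any particular LLM, which is one way the effect of model choice could be compared.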

Problem

Research questions and friction points this paper is trying to address.

infectious disease spread

individual decision-making

spatial heterogeneity

agent-based simulation

behavioral dynamics

Innovation

Methods, ideas, or system contributions that make the work stand out.

large language model

agent-based simulation

spatial epidemiology