Guided Persona-based AI Surveys: Can we replicate personal mobility preferences at scale using LLMs?

2025-01-20
AI Summary
Traditional travel surveys suffer from high costs, low efficiency, poor scalability, and privacy constraints. To address these challenges, this paper proposes a large language model (LLM)-based synthetic data generation method. Its core innovation is a structured "persona" prompting mechanism that explicitly encodes demographic and behavioral attributes as conditional constraints to guide the LLM in generating high-fidelity travel preference data. The method is calibrated and evaluated on the real-world Mobility in Germany (MiD) 2017 dataset. Compared with five state-of-the-art synthetic data generation approaches, it more accurately reproduces multidimensional dependencies among age, income, geography, and transport mode choice, reducing key error metrics by 32%. The framework ensures privacy preservation, incurs minimal computational cost, and supports scalable deployment. Moreover, it enables interpretable, scenario-based transportation policy analysis through controllable, attribute-conditioned generation.
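
To make the persona idea concrete, here is a minimal sketch of how demographic and behavioral attributes might be rendered as explicit constraints in a prompt. It is not the paper's implementation: the `Persona` fields, the `MODE_OPTIONS` list, the `build_persona_prompt` helper, and the prompt wording are all hypothetical, chosen only for illustration, and no LLM is called.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Persona:
    """Hypothetical persona record; field names are illustrative assumptions."""
    age_group: str
    income_level: str
    region_type: str
    owns_car: bool


# Assumed answer options for a main-mode question; not taken from the paper.
MODE_OPTIONS = ["walking", "bicycle", "public transport", "car"]


def build_persona_prompt(persona: Persona, question: str, options: list[str]) -> str:
    """Render persona attributes as explicit constraints ahead of a survey question."""
    attribute_lines = [
        f"- Age group: {persona.age_group}",
        f"- Income level: {persona.income_level}",
        f"- Region type: {persona.region_type}",
        f"- Owns a car: {'yes' if persona.owns_car else 'no'}",
    ]
    option_lines = [f"{i}. {opt}" for i, opt in enumerate(options, start=1)]
    parts = [
        "Answer the survey question as a person with these attributes:",
        *attribute_lines,
        "",
        f"Question: {question}",
        *option_lines,
        "Reply with the number of exactly one option.",
    ]
    return "\n".join(parts)


# Illustrative persona with made-up attribute values.
example_persona = Persona("30-39", "medium", "urban", owns_car=False)
example_prompt = build_persona_prompt(
    example_persona,
    "Which mode of transport do you use most often?",
    MODE_OPTIONS,
)
print(example_prompt)
```

The resulting string would be sent to whichever LLM is being used. Restricting the reply to a single numbered option is one possible way to keep answers easy to tabulate against survey categories.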

Abstract
This study explores the potential of Large Language Models (LLMs) to generate artificial surveys, with a focus on personal mobility preferences in Germany. By leveraging LLMs for synthetic data creation, we aim to address the limitations of traditional survey methods, such as high costs, inefficiency and scalability challenges. A novel approach incorporating "Personas" (combinations of demographic and behavioural attributes) is introduced and compared to five other synthetic survey methods, which vary in their use of real-world data and methodological complexity. The MiD 2017 dataset, a comprehensive mobility survey in Germany, serves as a benchmark to assess the alignment of synthetic data with real-world patterns. The results demonstrate that LLMs can effectively capture complex dependencies between demographic attributes and preferences while offering flexibility to explore hypothetical scenarios. This approach presents valuable opportunities for transportation planning and social science research, enabling scalable, cost-efficient and privacy-preserving data generation.
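
One way to quantify how well synthetic answers align with a benchmark is to compare the answer shares of the two samples. The sketch below uses total variation distance as an illustrative stand-in; the abstract does not say which metrics the paper uses, and the answer lists are toy values, not MiD 2017 data. The helper names `category_shares` and `total_variation_distance` are hypothetical.

```python
from collections import Counter


def category_shares(responses: list[str]) -> dict[str, float]:
    """Return the share of each answer category in a list of responses."""
    if not responses:
        raise ValueError("responses must not be empty")
    counts = Counter(responses)
    total = sum(counts.values())
    return {category: n / total for category, n in counts.items()}


def total_variation_distance(p: dict[str, float], q: dict[str, float]) -> float:
    """Half the sum of absolute share differences; 0 means identical shares."""
    categories = set(p) | set(q)
    return 0.5 * sum(abs(p.get(c, 0.0) - q.get(c, 0.0)) for c in categories)


# Toy answers for illustration only; these are NOT values from MiD 2017.
benchmark_answers = ["car"] * 5 + ["bicycle"] * 3 + ["walking"] * 2
synthetic_answers = ["car"] * 6 + ["bicycle"] * 2 + ["walking"] * 2

tvd = total_variation_distance(
    category_shares(benchmark_answers),
    category_shares(synthetic_answers),
)
print(f"Total variation distance: {tvd:.3f}")
```

To probe dependencies between demographic attributes and preferences rather than overall shares, the same comparison could be repeated within each demographic group, for example per age group or region type.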
Problem

Research questions and friction points this paper is trying to address.

Large Language Models
Personal Travel Preferences
Data Generation
Innovation

Methods, ideas, or system contributions that make the work stand out.

Large Language Models
Synthetic Surveys
Personal Mobility Preferences
Ioannis Tzachristas
Chair of Transportation Systems Engineering, TUM School of Engineering and Design, TUM, Germany
Santhanakrishnan Narayanan
Chair of Transportation Systems Engineering, TUM School of Engineering and Design, TUM, Germany
Constantinos Antoniou
Full Professor, Technical University of Munich (TUM)
Transportation systems, big data analytics, data-driven approaches, traffic simulation, road safety