🤖 AI Summary
This study addresses the challenge of accurately predicting individual travel mode choice to support transportation planning and policy-making. It proposes a novel three-stage approach that integrates statistical feature analysis with large language models (LLMs): first identifying key trip characteristics, then converting structured data into natural language descriptions, and finally leveraging zero-shot or few-shot learning enhanced with domain-informed prompts for prediction. The method innovatively combines feature-driven natural language modeling with injected domain knowledge. Comprehensive evaluations assess the performance of GPT-4o, o3-mini, and o4-mini under few-shot and domain-enhanced prompting settings. Results show that LLM-based approaches achieve accuracy comparable to state-of-the-art classifiers, with o3-mini yielding up to a 42.9% improvement using only five examples. Domain-enhanced prompts significantly boost general-purpose models (e.g., GPT-4o gains of 2.27%–12.50%), though effects vary across reasoning-oriented architectures.
📝 Abstract
Understanding traveler behavior and accurately predicting travel mode choice are at the heart of transportation planning and policy-making. This study proposes TransMode-LLM, an innovative framework that integrates statistical methods with large language model (LLM)-based techniques to predict travel modes from travel survey data. The framework operates through three phases: (1) statistical analysis identifies key behavioral features, (2) natural language encoding transforms structured data into contextual descriptions, and (3) LLM adaptation predicts travel mode through multiple learning paradigms, including zero-shot, one-shot, and few-shot learning as well as domain-enhanced prompting. We evaluate TransMode-LLM using both general-purpose models (GPT-4o, GPT-4o-mini) and reasoning-focused models (o3-mini, o4-mini) with varying sample sizes on real-world travel survey data. Extensive experimental results demonstrate that the LLM-based approach achieves competitive accuracy compared to state-of-the-art baseline classifiers. Moreover, few-shot learning significantly improves prediction accuracy, with o3-mini showing consistent improvements of up to 42.9% with only five provided examples. However, domain-enhanced prompting shows divergent effects across LLM architectures: it improves performance for general-purpose models, with GPT-4o achieving gains of 2.27% to 12.50%, whereas for reasoning-oriented models (o3-mini, o4-mini) domain knowledge enhancement does not universally improve performance. This study advances the application of LLMs in travel behavior modeling, providing valuable insights for both academic research and future transportation policy-making.
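The abstract's phases (2) and (3) can be sketched in a few lines of code: encode a structured survey record as a natural-language description, then assemble a few-shot prompt from labeled examples. The field names, mode labels, and template wording below are illustrative assumptions, not the paper's actual schema or prompts.

```python
# Minimal sketch of natural-language encoding plus few-shot prompt assembly.
# All field names ('age', 'has_car', ...) and the prompt template are
# hypothetical; the paper's own feature set comes from its statistical analysis.

def encode_trip(record: dict) -> str:
    """Turn one structured survey row into a sentence an LLM can reason over."""
    return (
        f"A {record['age']}-year-old traveler with "
        f"{'a' if record['has_car'] else 'no'} household car makes a "
        f"{record['distance_km']} km {record['purpose']} trip "
        f"during {record['time_of_day']}."
    )

def build_prompt(examples: list[tuple[dict, str]], query: dict) -> str:
    """Few-shot prompt: labeled example trips followed by the query trip."""
    lines = ["Predict the travel mode (car, transit, bike, or walk) for each trip.\n"]
    for rec, mode in examples:
        lines.append(f"Trip: {encode_trip(rec)}\nMode: {mode}\n")
    lines.append(f"Trip: {encode_trip(query)}\nMode:")
    return "\n".join(lines)

example = ({"age": 34, "has_car": True, "distance_km": 12.5,
            "purpose": "commute", "time_of_day": "the morning peak"}, "car")
query = {"age": 27, "has_car": False, "distance_km": 2.0,
         "purpose": "shopping", "time_of_day": "midday"}
print(build_prompt([example], query))
```

The resulting string would be sent to a model such as GPT-4o or o3-mini; domain-enhanced prompting, in the paper's terms, would prepend transportation-specific guidance to the same prompt.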