Mobility Anomaly Generation using LLM-Driven Behavior with Kinematic Constraints

📅 2026-06-08
📈 Citations: 0
✨ Influential: 0
🤖 AI Summary
This work addresses the scarcity of real-world human trajectory datasets with precise anomaly annotations, a limitation that stems from the rarity of anomalous events, high data collection costs, and stringent privacy regulations. To bridge this gap, the paper introduces a novel end-to-end framework for generating realistic anomalous trajectories. Starting from baseline simulated trajectories, the approach uses large language model (LLM) agents to inject semantically plausible anomalous behaviors, recomputes the transitions between the modified staypoints with map-constrained routing, and applies an environment-aware spatial noise model that emulates GPS sensor degradation, so the synthesized trajectories stay physically valid while remaining realistic. By combining LLM-driven behavioral modeling with kinematic constraints, the method aims to produce annotated anomalous human trajectories at scale and to narrow the sim-to-real gap in trajectory analysis.
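The staypoint-level edits described in the summary can be pictured with a small sketch. This is not the paper's method: the paper uses LLM agents to choose the anomalies, whereas here the choice is passed in by hand. The `Staypoint` fields, the label strings, and both function names are hypothetical and exist only for illustration.

```python
from dataclasses import dataclass, replace
from typing import List, Tuple


@dataclass(frozen=True)
class Staypoint:
    place: str              # semantic name of the visited location
    arrive_min: int         # arrival time, minutes since midnight
    depart_min: int         # departure time, minutes since midnight
    label: str = "normal"   # ground-truth annotation (hypothetical scheme)


def inject_unusual_checkin(day: List[Staypoint], index: int, place: str) -> List[Staypoint]:
    """Replace one visit with an out-of-routine place and annotate it."""
    out = list(day)  # copy so the baseline schedule is left untouched
    out[index] = replace(day[index], place=place, label="unusual_checkin")
    return out


def skip_visit(day: List[Staypoint], index: int) -> Tuple[List[Staypoint], Staypoint]:
    """Drop one routine visit; return the new schedule and the skipped stay."""
    return day[:index] + day[index + 1:], day[index]
```

Keeping the skipped stay as a separate return value is one way to preserve the annotation for an event that, by construction, leaves no point in the modified schedule.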
📝 Abstract
Although the study of human trajectory anomalies is critical for advancing spatial data mining, empirical research remains severely hindered by a pervasive lack of ground-truth datasets. Several real-world and simulated human trajectory collections are available, but they capture only normal mobility patterns and lack annotated anomalies. This scarcity is driven by the inherent statistical rarity of anomalous events, which makes conventional observational collection infeasible. Compounding this challenge, the systematic acquisition of large-scale mobility data is bottlenecked by prohibitive costs and stringent privacy regulations. To overcome these limitations and establish a reliable dataset of human trajectory anomalies with annotated ground truth, we introduce a novel, end-to-end generative framework designed to synthesize realistic trajectory anomalies at scale. Our architecture bridges the gap between purely synthetic mobility data and complex real-world physical constraints by operating directly on baseline simulated trajectories. We employ Large Language Model (LLM) agents to systematically inject semantically meaningful behavioral anomalies, such as irregular out-of-distribution check-ins and skipped routine visits. To ensure spatial validity, the system uses map-constrained routing reconstruction to recalculate the physical transitions between the staypoints modified by the LLM agents. Moreover, to narrow the simulation-to-reality gap, we augment the resulting trajectories with a context-aware spatial noise model, parameterized by environmental and location-specific variables, to emulate heterogeneous GPS sensor degradation.
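A context-aware spatial noise step of the kind the abstract mentions could look roughly like the sketch below. It assumes isotropic Gaussian position error whose scale depends on an environment tag; the abstract does not state the actual noise distribution or its parameters. The environment names, the sigma values in `SIGMA_M`, and the function name `add_gps_noise` are illustrative assumptions, not values from the paper.

```python
import math
import random
from typing import List, Tuple

# Hypothetical horizontal error scale per environment, in metres.
# These numbers are placeholders, not taken from the paper.
SIGMA_M = {"open_sky": 3.0, "urban_canyon": 15.0, "indoor": 30.0}

METERS_PER_DEG_LAT = 111_320.0  # approximate length of one degree of latitude


def add_gps_noise(
    points: List[Tuple[float, float, str]], rng: random.Random
) -> List[Tuple[float, float]]:
    """Perturb (lat, lon, environment) fixes with environment-dependent noise."""
    noisy = []
    for lat, lon, env in points:
        sigma = SIGMA_M[env]
        d_north = rng.gauss(0.0, sigma)  # metres, north-south offset
        d_east = rng.gauss(0.0, sigma)   # metres, east-west offset
        new_lat = lat + d_north / METERS_PER_DEG_LAT
        # A degree of longitude shrinks with the cosine of the latitude.
        new_lon = lon + d_east / (METERS_PER_DEG_LAT * math.cos(math.radians(lat)))
        noisy.append((new_lat, new_lon))
    return noisy
```

Passing an explicit `random.Random` instance keeps the perturbation reproducible, which matters when the noisy trajectories must stay aligned with their ground-truth annotations.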
Problem

Research questions and friction points this paper is trying to address.

human trajectory anomalies
ground-truth dataset
mobility data
anomaly detection
data scarcity
Innovation

Methods, ideas, or system contributions that make the work stand out.

LLM-driven anomaly generation
kinematic constraints
map-constrained routing
context-aware spatial noise
synthetic trajectory dataset
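
The page does not spell out which kinematic constraints are enforced. One plausible reading, taken here purely as an assumption, is a cap on the speed implied by consecutive fixes. The sketch below flags segments that exceed such a cap; `haversine_m`, `segments_over_speed`, and the threshold used in any call are hypothetical names and values.

```python
import math
from typing import List, Tuple

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius


def haversine_m(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance between two points, in metres."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def segments_over_speed(
    track: List[Tuple[float, float, float]], max_speed_mps: float
) -> List[int]:
    """Return indices i where the move from fix i-1 to fix i is too fast.

    track holds (t_seconds, lat, lon) tuples sorted by time.
    """
    bad = []
    for i in range(1, len(track)):
        t0, lat0, lon0 = track[i - 1]
        t1, lat1, lon1 = track[i]
        dt = t1 - t0
        # Non-increasing timestamps are treated as violations as well.
        if dt <= 0 or haversine_m(lat0, lon0, lat1, lon1) / dt > max_speed_mps:
            bad.append(i)
    return bad
```

A check like this would typically run after routing and noise injection, so that segments made infeasible by either step can be regenerated or discarded.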