🤖 AI Summary
To detect anomalous behavior in connected and autonomous vehicles (CAVs) caused by sensor faults, cyberattacks, and environmental disturbances, this paper builds a simulated multi-vehicle interaction dataset of position, speed, and acceleration time series covering both typical and atypical interactions. Two models are applied to it. A stacked LSTM captures temporal dependencies in the sequences and learns standard driving behavior (R² = 0.9998, MAE = 82.425, 95th-percentile anomaly threshold = 265.63). A Random Forest supports detection with ensemble-based predictions, which the authors credit with improving interpretability and performance (R² = 0.9830, MAE = 5.746, 95th-percentile anomaly threshold = 14.18). The authors conclude that both models predict vehicle trajectories accurately and detect anomalies in autonomous driving scenarios.
📝 Abstract
Anomaly detection in connected autonomous vehicles (CAVs) is crucial for maintaining safe and reliable transportation networks, as CAVs can be susceptible to sensor malfunctions, cyber-attacks, and unexpected environmental disruptions. This study explores an anomaly detection approach by simulating vehicle behavior, generating a dataset that represents typical and atypical vehicular interactions. The dataset includes time-series data of position, speed, and acceleration for multiple connected autonomous vehicles. We utilized machine learning models to effectively identify abnormal driving patterns. First, we applied a stacked Long Short-Term Memory (LSTM) model to capture temporal dependencies and sequence-based anomalies. The stacked LSTM model processed the sequential data to learn standard driving behaviors. Additionally, we deployed a Random Forest model to support anomaly detection by offering ensemble-based predictions, which enhanced model interpretability and performance. The Random Forest model achieved an R² of 0.9830, MAE of 5.746, and a 95th percentile anomaly threshold of 14.18, while the stacked LSTM model attained an R² of 0.9998, MAE of 82.425, and a 95th percentile anomaly threshold of 265.63. These results demonstrate the models' effectiveness in accurately predicting vehicle trajectories and detecting anomalies in autonomous driving scenarios.
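The abstract reports a 95th percentile anomaly threshold for each model but does not say how it is computed. A common reading is that the threshold is the 95th percentile of the absolute prediction error, and that samples whose error exceeds it are flagged as anomalous. The sketch below illustrates that reading only. The function names, the synthetic data, and the choice of absolute error are assumptions for illustration, not details taken from the paper.

```python
import numpy as np


def percentile_threshold(errors, q=95.0):
    """Return the q-th percentile of the absolute prediction errors.

    Assumption: the paper's "95th percentile anomaly threshold" is taken
    over absolute errors; the abstract does not confirm this.
    """
    return float(np.percentile(np.abs(np.asarray(errors, dtype=float)), q))


def flag_anomalies(y_true, y_pred, threshold):
    """Mark samples whose absolute prediction error exceeds the threshold."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return np.abs(y_true - y_pred) > threshold


# Synthetic stand-in for model output (hypothetical values, not the paper's data).
rng = np.random.default_rng(0)
y_true = rng.normal(100.0, 20.0, size=1000)
y_pred = y_true + rng.normal(0.0, 5.0, size=1000)

threshold = percentile_threshold(y_true - y_pred)
flags = flag_anomalies(y_true, y_pred, threshold)
print(f"threshold={threshold:.2f}, flagged={int(flags.sum())} of {flags.size}")
```

Under this definition roughly 5% of the samples used to fit the threshold are flagged by construction, so the threshold is only informative when it is applied to data other than the data it was computed from. It also scales with each model's error magnitude, which is consistent with the LSTM's larger MAE (82.425) coming with a larger threshold (265.63) than the Random Forest's (MAE 5.746, threshold 14.18).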