🤖 AI Summary
Automotive cybersecurity suffers from a scarcity of high-quality attack data because penetration testing on real vehicles is constrained, which hinders the training and evaluation of intrusion detection systems (IDS). To address this, we propose a context-aware CAN attack data generation method that integrates parametric attack modeling, semantic decoding of CAN messages, and dynamic attack intensity modulation. The method generates diverse, high-fidelity attack logs, including DoS, fuzzy, and replay attacks. Crucially, our approach is the first to jointly model bus protocol characteristics and adversarial behavior, significantly enhancing data realism and scenario scalability. Experimental evaluation on two deep learning–based IDS models demonstrates that our generated data improves attack detection rate and classification accuracy by 12.6% and 9.3%, respectively. This enables efficient, reproducible, and realistic automotive security assessment.
📝 Abstract
The digital evolution of connected vehicles and the resulting security risks emphasize the critical need for in-vehicle cybersecurity measures such as intrusion detection and response systems. The continuous advancement of attack scenarios further highlights the need for adaptive detection mechanisms that can detect evolving, unknown, and complex threats. ML-driven techniques can help address this challenge. However, safety, cost, and ethical considerations constrain the implementation of diverse attack scenarios on test vehicles, which results in a scarcity of data representing attack scenarios. This limitation calls for alternative methods that generate high-quality attack-representing data efficiently and effectively. This paper presents a context-aware attack data generator that produces attack inputs and the corresponding in-vehicle network logs, i.e., controller area network (CAN) logs, representing various types of attack, including denial of service (DoS), fuzzy, spoofing, suspension, and replay attacks. It uses parameterized attack models, augmented with CAN message decoding and attack intensity adjustments, to configure attack scenarios that closely resemble real-world scenarios and to promote variability. We evaluate the practicality of the generated attack-representing data in an intrusion detection system (IDS) case study, in which we develop two deep neural network IDS models and evaluate them empirically using the generated data. In addition to the efficiency and scalability of the approach, the performance of the IDS models, which show high detection and classification capabilities, validates the consistency and effectiveness of the generated data. In this experience study, we also elaborate on the aspects that influence the fidelity of the data to real-world scenarios and provide insights into its application.
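To make the idea of a parameterized attack model with an adjustable intensity more concrete, the following is a minimal sketch and not the generator described in the paper. The `Frame` record, the `inject` function, its parameters, the `label` field, and the specific payloads (all-zero DoS frames on ID `0x000`, an all-`0xFF` spoofed payload) are assumptions made purely for illustration. For injection attacks (DoS, fuzzy, spoofing) `intensity` is assumed to scale the injection rate; for suspension and replay it is assumed to be the probability that a frame is dropped or replayed. CAN message decoding, which the paper uses to make attacks context-aware, is not modeled here.

```python
import random
from dataclasses import dataclass

ATTACKS = {"dos", "fuzzy", "spoofing", "suspension", "replay"}


@dataclass(frozen=True)
class Frame:
    t: float               # timestamp in seconds
    can_id: int            # 11-bit arbitration ID
    data: bytes            # payload, up to 8 bytes
    label: str = "normal"  # ground-truth tag (assumed, for IDS training)


def inject(log, attack, start, end, intensity=1.0,
           target_id=0x100, period=0.01, seed=0):
    """Return a new time-sorted log with `attack` applied in [start, end).

    Hypothetical helper: `intensity` must be > 0 for injection attacks and
    is treated as a probability in [0, 1] for suspension and replay.
    """
    if attack not in ATTACKS:
        raise ValueError(f"unknown attack: {attack}")
    rng = random.Random(seed)
    log = list(log)

    if attack == "suspension":
        # Silence the target ID: drop its frames inside the window.
        return [f for f in log
                if not (f.can_id == target_id and start <= f.t < end
                        and rng.random() < intensity)]

    if attack == "replay":
        # Re-send traffic recorded in the window of equal length just
        # before `start`, shifted forward into the attack window.
        span = end - start
        injected = [Frame(f.t + span, f.can_id, f.data, "replay")
                    for f in log
                    if start - span <= f.t < start and rng.random() < intensity]
    else:
        gap = period / intensity            # higher intensity, denser frames
        count = round((end - start) / gap)
        injected = []
        for i in range(count):
            t = start + i * gap
            if attack == "dos":
                # Highest-priority ID wins arbitration and floods the bus.
                injected.append(Frame(t, 0x000, bytes(8), "dos"))
            elif attack == "fuzzy":
                payload = bytes(rng.randrange(256) for _ in range(8))
                injected.append(Frame(t, rng.randrange(0x800), payload, "fuzzy"))
            else:  # spoofing: forged payload on a legitimate ID
                injected.append(Frame(t, target_id, b"\xff" * 8, "spoofing"))

    return sorted(log + injected, key=lambda f: f.t)


# Usage: a synthetic 3 s baseline with one ID sent every 10 ms,
# then a DoS attack at twice the baseline rate during [1 s, 2 s).
baseline = [Frame(i / 100, 0x100, bytes(8)) for i in range(300)]
attacked = inject(baseline, "dos", start=1.0, end=2.0, intensity=2.0)
```

Keeping the attack type, time window, target ID, and intensity as explicit parameters is what allows many labeled scenario variants to be produced from a single benign recording, which is the kind of variability the abstract refers to.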