Machine Learning for Network Attacks Classification and Statistical Evaluation of Adversarial Learning Methodologies for Synthetic Data Generation

📅 2026-03-18
🤖 AI Summary
This work addresses three challenges in network intrusion detection: data scarcity, privacy sensitivity, and insufficient model robustness. It introduces a unified multi-modal dataset that combines flow-level, packet payload, and temporal contextual features in a single feature space. To increase data availability while preserving privacy, the study generates synthetic data with adversarial generative models and evaluates it with the Synthetic Data Vault (SDV) framework. The fidelity, utility, and privacy of the generated data are assessed with f-divergence measures, distinguishability tests, TRTS/TSTR evaluations, and non-parametric statistical tests. The reported findings are stable ML models for intrusion detection and generative models with high fidelity and utility, which together offer a reproducible basis for cybersecurity research and evaluation.
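The TRTS/TSTR evaluations mentioned above can be illustrated with a minimal sketch. TSTR (Train on Synthetic, Test on Real) fits a classifier on synthetic data and scores it on real data; TRTS does the reverse. Everything below is an assumption made for illustration: the nearest-centroid classifier, the function names, and the toy feature values are hypothetical and are not taken from the paper.

```python
from statistics import mean


def fit_centroids(X, y):
    """Fit a nearest-centroid classifier: one mean feature vector per class."""
    centroids = {}
    for label in set(y):
        rows = [x for x, lab in zip(X, y) if lab == label]
        centroids[label] = [mean(col) for col in zip(*rows)]
    return centroids


def predict(centroids, x):
    """Return the label whose centroid is closest to x (squared Euclidean)."""
    def sq_dist(c):
        return sum((a - b) ** 2 for a, b in zip(x, c))
    return min(centroids, key=lambda label: sq_dist(centroids[label]))


def accuracy(centroids, X, y):
    """Fraction of samples whose predicted label matches the true label."""
    hits = sum(predict(centroids, x) == lab for x, lab in zip(X, y))
    return hits / len(y)


def tstr(real, synth):
    """Train on Synthetic, Test on Real: a proxy for synthetic-data utility."""
    (X_real, y_real), (X_synth, y_synth) = real, synth
    return accuracy(fit_centroids(X_synth, y_synth), X_real, y_real)


def trts(real, synth):
    """Train on Real, Test on Synthetic: do synthetic samples look like their class?"""
    (X_real, y_real), (X_synth, y_synth) = real, synth
    return accuracy(fit_centroids(X_real, y_real), X_synth, y_synth)


# Hypothetical toy data: two features per flow, two classes.
real = (
    [[0.1, 0.2], [0.0, 0.3], [0.9, 1.0], [1.1, 0.8]],
    ["benign", "benign", "attack", "attack"],
)
synth = (
    [[0.2, 0.1], [0.1, 0.1], [1.0, 0.9], [0.8, 1.2]],
    ["benign", "benign", "attack", "attack"],
)

tstr_acc = tstr(real, synth)
trts_acc = trts(real, synth)
print(f"TSTR accuracy: {tstr_acc:.2f}, TRTS accuracy: {trts_acc:.2f}")
```

In practice the two scores are usually compared against a train-on-real, test-on-real baseline; a large gap suggests the synthetic data misses structure the classifier relies on.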

📝 Abstract
Supervised detection of network attacks has always been a critical part of network intrusion detection systems (NIDS). At a pivotal time for artificial intelligence (AI), with increasingly sophisticated attacks that use advanced techniques such as generative artificial intelligence (GenAI) and reinforcement learning, it has become vital for protecting our personal data, which are scattered across the web. In this paper, we address two tasks on the first unified multi-modal NIDS dataset, which incorporates flow-level data, packet payload information and temporal contextual features from the reprocessed CIC-IDS-2017, CIC-IoT-2023, UNSW-NB15 and CIC-DDoS-2019 datasets in a shared feature space. In the first task, we use machine learning (ML) algorithms with stratified cross-validation to detect network attacks stably and reliably. In the second task, we use adversarial learning algorithms to generate synthetic data, compare them with the real data, and evaluate their fidelity, utility and privacy using the Synthetic Data Vault (SDV) framework, f-divergences, distinguishability and non-parametric statistical tests. The findings provide stable ML models for intrusion detection and generative models with high fidelity and utility, obtained by combining the SDV framework and the TRTS and TSTR tests with non-parametric statistical tests and f-divergence measures.
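The stratified cross-validation named in the abstract can be sketched as follows. The idea is that every fold keeps roughly the same class proportions as the full dataset, which matters for imbalanced attack classes. This is a minimal illustration under stated assumptions: the function names, the label values, and the round-robin assignment are hypothetical, and the paper's actual splitting code is not shown in the source.

```python
import random
from collections import defaultdict


def stratified_folds(labels, k, seed=0):
    """Return k lists of sample indices with a similar class mix in each."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        # Deal each class's samples round-robin across the folds.
        for position, idx in enumerate(indices):
            folds[position % k].append(idx)
    return folds


def train_test_splits(labels, k, seed=0):
    """Yield (train_indices, test_indices) pairs, one per fold."""
    folds = stratified_folds(labels, k, seed)
    for i, test_idx in enumerate(folds):
        train_idx = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train_idx, test_idx


# Hypothetical imbalanced labels: 8 benign flows and 4 DDoS flows.
labels = ["benign"] * 8 + ["ddos"] * 4
splits = list(train_test_splits(labels, k=4))

for train_idx, test_idx in splits:
    test_labels = sorted(labels[i] for i in test_idx)
    print(len(train_idx), "train /", len(test_idx), "test:", test_labels)
```

Each test fold here holds two benign flows and one DDoS flow, matching the 2:1 ratio of the full label list. A library splitter such as scikit-learn's `StratifiedKFold` provides the same behaviour with more options.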
Problem

Research questions and friction points this paper is trying to address.

network attacks classification
intrusion detection
synthetic data generation
adversarial learning
machine learning
Innovation

Methods, ideas, or system contributions that make the work stand out.

multi-modal NIDS dataset
adversarial learning
synthetic data generation
f-divergence
non-parametric statistical tests
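The last two items above can be made concrete with a short sketch. An f-divergence between discrete distributions P and Q is D_f(P || Q) = sum_i q_i * f(p_i / q_i) for a convex generator f with f(1) = 0; the Kullback-Leibler divergence and total variation distance are two standard choices of f. The two-sample Kolmogorov-Smirnov statistic is used here as one example of a non-parametric comparison. The source does not say which divergences or tests the paper uses, so these choices, the function names, and the toy histograms are assumptions.

```python
import math
from bisect import bisect_right


def f_divergence(p, q, f):
    """D_f(P || Q) = sum_i q_i * f(p_i / q_i); assumes every q_i > 0."""
    return sum(qi * f(pi / qi) for pi, qi in zip(p, q))


def kl_generator(t):
    """f(t) = t * log(t) yields the Kullback-Leibler divergence."""
    return t * math.log(t) if t > 0 else 0.0


def tv_generator(t):
    """f(t) = |t - 1| / 2 yields the total variation distance."""
    return 0.5 * abs(t - 1.0)


def ks_statistic(a, b):
    """Two-sample KS statistic: largest gap between the empirical CDFs."""
    a, b = sorted(a), sorted(b)
    return max(
        abs(bisect_right(a, x) / len(a) - bisect_right(b, x) / len(b))
        for x in a + b
    )


# Hypothetical normalized histograms of one feature (real vs. synthetic).
real_hist = [0.5, 0.3, 0.2]
synth_hist = [0.4, 0.4, 0.2]

kl = f_divergence(real_hist, synth_hist, kl_generator)
tv = f_divergence(real_hist, synth_hist, tv_generator)
ks_same = ks_statistic([1, 2, 3, 4], [1, 2, 3, 4])
ks_disjoint = ks_statistic([1, 2], [3, 4])

print(f"KL = {kl:.4f}, TV = {tv:.4f}")
print(f"KS (identical samples) = {ks_same}, KS (disjoint samples) = {ks_disjoint}")
```

Lower divergence and KS values indicate that the synthetic feature distribution is closer to the real one. Turning the KS statistic into a p-value needs an additional step, which a library routine such as SciPy's `ks_2samp` handles.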
Iakovos-Christos Zarkadis
University of Piraeus, Athens, Greece
Christos Douligeris
Professor
Computer Networks