A Novel Approach to Network Traffic Analysis: the HERA tool

📅 2025-01-13

🤖 AI Summary

Existing network traffic analysis tools (e.g., CICFlowMeter) suffer from critical limitations in flow delineation, feature extraction, and label consistency, undermining the reliability and reproducibility of intrusion detection systems (IDS). To address these issues, we propose HERA—a lightweight, open-source, end-to-end traffic processing framework. HERA is the first tool to support configurable feature sets and fine-grained flow labeling, integrating NetFlow/IPFIX parsing, customizable feature engineering, and flexible label mapping. Implemented in Python, it natively supports standard datasets such as UNSW-NB15. Experimental evaluation on UNSW-NB15 demonstrates >99.8% flow generation accuracy and 100% label consistency across all flows. HERA significantly enhances traffic data fidelity, usability, and extensibility, thereby establishing a high-fidelity, reproducible foundation for IDS research and development.

📝 Abstract

Cybersecurity threats highlight the need for robust network intrusion detection systems to identify malicious behaviour. These systems rely heavily on large datasets to train machine learning models capable of detecting patterns and predicting threats. In the past two decades, researchers have produced a multitude of datasets; however, some widely utilised recent datasets generated with CICFlowMeter contain inaccuracies. These inaccuracies cause inconsistencies in flow generation and feature extraction, leading to skewed results and reduced system effectiveness. Other tools in this context lack ease of use, customizable feature sets, and flow labelling options. In this work, we introduce HERA, a new open-source tool that generates flow files and labelled or unlabelled datasets with user-defined features. Validated and tested with the UNSW-NB15 dataset, HERA demonstrated accurate flow and label generation.
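To make the terms "flow generation", "flow labelling", and "user-defined features" concrete, the sketch below groups packets into flows by 5-tuple, labels each flow against a ground-truth list, and emits only the requested features. It is a generic illustration and not HERA's implementation: the names `Packet`, `build_flows`, `label_flows`, and `to_row`, the 60-second idle timeout, and the overlap-based labelling rule are all assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Packet:
    """Hypothetical minimal packet record (not a HERA type)."""
    ts: float      # capture timestamp in seconds
    src: str
    sport: int
    dst: str
    dport: int
    proto: str
    size: int      # packet length in bytes


def build_flows(packets, idle_timeout=60.0):
    """Group packets into unidirectional flows keyed by 5-tuple.

    A new flow starts when a 5-tuple is seen for the first time or when
    the gap since its last packet exceeds idle_timeout (assumed value).
    """
    flows = []
    active = {}  # 5-tuple -> index of its most recent flow in `flows`
    for p in sorted(packets, key=lambda pkt: pkt.ts):
        key = (p.src, p.sport, p.dst, p.dport, p.proto)
        idx = active.get(key)
        if idx is None or p.ts - flows[idx]["end"] > idle_timeout:
            flows.append({"key": key, "start": p.ts, "end": p.ts,
                          "packets": 0, "bytes": 0})
            idx = len(flows) - 1
            active[key] = idx
        flow = flows[idx]
        flow["end"] = p.ts
        flow["packets"] += 1
        flow["bytes"] += p.size
    return flows


def label_flows(flows, ground_truth, default="benign"):
    """Attach a label to each flow.

    ground_truth is a list of (five_tuple, start, end, label) entries.
    A flow takes the label of the first entry with the same 5-tuple and
    an overlapping time window; otherwise it keeps the default label.
    """
    for flow in flows:
        flow["label"] = default
        for key, start, end, label in ground_truth:
            if key == flow["key"] and flow["start"] <= end and start <= flow["end"]:
                flow["label"] = label
                break
    return flows


def to_row(flow, features=("packets", "bytes", "duration")):
    """Emit only the requested features, plus the label if one is present."""
    available = {
        "packets": flow["packets"],
        "bytes": flow["bytes"],
        "duration": flow["end"] - flow["start"],
    }
    row = {name: available[name] for name in features}
    if "label" in flow:
        row["label"] = flow["label"]
    return row
```

Skipping the `label_flows` step yields an unlabelled dataset, and changing the `features` tuple changes which columns are written, which mirrors the labelled/unlabelled and user-defined-feature options the abstract describes.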

Problem

Research questions and friction points this paper is trying to address.

Network Traffic Analysis

Accuracy Issues

Feature Selection Limitations

Innovation

Methods, ideas, or system contributions that make the work stand out.

HERA

Flow Generation

Label Accuracy
