AlertBERT: A noise-robust alert grouping framework for simultaneous cyber attacks

📅 2026-02-06
🤖 AI Summary
This work addresses the limitations of conventional time-window-based alert grouping, which breaks down when false-positive noise is high and several attacks occur at once, leaving analysts exposed to alert fatigue. The authors propose AlertBERT, a self-supervised framework that uses a masked language model (MLM) to encode alerts semantically and then groups them with density-based clustering (e.g., DBSCAN), avoiding rigid temporal constraints and supporting both real-time and post-hoc analysis. For evaluation, they introduce a data augmentation method that controls the noise level and simulates concurrent attacks. On data sets generated with this method, AlertBERT consistently outperforms time-based grouping in identifying the correct alert groups.
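The density-based clustering step mentioned in the summary can be illustrated with a minimal pure-Python DBSCAN. This is a sketch under stated assumptions, not the authors' implementation: the 2-D points stand in for alert embeddings that an MLM encoder would produce, and the names `dbscan` and `toy_embeddings` as well as the `eps` and `min_samples` values are hypothetical choices made for illustration.

```python
from math import dist  # Euclidean distance, Python 3.8+


def dbscan(points, eps, min_samples):
    """Return one label per point: 0, 1, ... for clusters, -1 for noise."""
    UNVISITED, NOISE = None, -1
    labels = [UNVISITED] * len(points)

    def neighbours(i):
        # All points within eps of point i (including i itself).
        return [j for j, q in enumerate(points) if dist(points[i], q) <= eps]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not UNVISITED:
            continue
        seeds = neighbours(i)
        if len(seeds) < min_samples:
            labels[i] = NOISE  # may later be relabelled as a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point reached from a core point
            if labels[j] is not UNVISITED:
                continue
            labels[j] = cluster
            nbrs = neighbours(j)
            if len(nbrs) >= min_samples:  # j is a core point: keep expanding
                queue.extend(nbrs)
    return labels


# Hypothetical toy embeddings: two tight groups (two attacks) and one outlier
# (a false-positive alert). Real embeddings would be high-dimensional.
toy_embeddings = [
    (0.0, 0.0), (0.1, 0.0), (0.0, 0.1),   # alerts from attack A
    (5.0, 5.0), (5.1, 5.0), (5.0, 5.1),   # alerts from attack B
    (10.0, 0.0),                           # isolated noise alert
]
labels = dbscan(toy_embeddings, eps=0.5, min_samples=3)
print(labels)
```

In this toy setting the two attacks fall into separate clusters and the isolated alert is marked as noise (label -1), regardless of when each alert arrived, because grouping depends only on distance in the embedding space.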

📝 Abstract
Automated detection of cyber attacks is a critical capability to counteract their growing volume and sophistication. However, the high number of security alerts issued by intrusion detection systems leads to alert fatigue among analysts working in security operations centres (SOCs), which in turn causes slow reaction times and incorrect decision making. Alert grouping, which refers to clustering of security alerts according to their underlying causes, can significantly reduce the number of distinct items analysts have to consider. Unfortunately, conventional time-based alert grouping solutions are unsuitable for large-scale computer networks characterised by high levels of false positive alerts and simultaneously occurring attacks. To address these limitations, we propose AlertBERT, a self-supervised framework designed to group alerts from isolated or concurrent attacks in noisy environments. Our open-source implementation of AlertBERT leverages masked language models and density-based clustering to support both real-time and forensic operation. To evaluate our framework, we further introduce a novel data augmentation method that enables flexible control over noise levels and simulates concurrent attack occurrences. Based on the data sets generated through this method, we demonstrate that AlertBERT consistently outperforms conventional time-based grouping techniques, achieving superior accuracy in identifying correct alert groups.
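To see why time-based grouping struggles with simultaneous attacks, consider a minimal sketch of a generic gap-based scheme: a new group starts whenever the pause between consecutive alerts exceeds a threshold. This is an assumed, simplified baseline for illustration only, not necessarily the exact technique the paper compares against; `group_by_time_gap`, `max_gap`, and the toy alert stream are hypothetical.

```python
def group_by_time_gap(timestamps, max_gap):
    """Assign a group id to each timestamp (sorted ascending).

    A new group starts whenever the gap to the previous alert exceeds max_gap.
    """
    groups = []
    current = 0
    previous = None
    for t in timestamps:
        if previous is not None and t - previous > max_gap:
            current += 1
        groups.append(current)
        previous = t
    return groups


# Hypothetical alert stream: (timestamp in seconds, true origin).
# Attacks A and B run concurrently, so their alerts interleave in time.
alerts = sorted([
    (0, "A"), (2, "A"), (4, "A"),
    (1, "B"), (3, "B"), (5, "B"),
    (20, "noise"),
])
groups = group_by_time_gap([t for t, _ in alerts], max_gap=5)
for (t, origin), g in zip(alerts, groups):
    print(f"t={t:>2}  origin={origin:<5}  group={g}")
```

Because the alerts of A and B interleave with gaps of one second, the scheme places all six in a single group; only the late noise alert is separated. No choice of `max_gap` can split the two attacks here without also splitting each attack internally, which is the kind of limitation that motivates grouping by alert content rather than arrival time.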
Problem

Research questions and friction points this paper is trying to address.

alert grouping
cyber attacks
alert fatigue
false positives
concurrent attacks
Innovation

Methods, ideas, or system contributions that make the work stand out.

AlertBERT
self-supervised learning
alert grouping
masked language model
density-based clustering