🤖 AI Summary
To address the high false positives, catastrophic forgetting, and traceability challenges that concept drift causes in APT detection, this paper proposes METANOIA, the first provenance-based intrusion detection system designed for lifelong learning. Methodologically, it combines four mechanisms: (1) pseudo-edge connections to mitigate catastrophic forgetting; (2) suspicious-state transfer to avoid learning malicious behaviors; (3) path-level node filtering to make alerts more precise; and (4) mini-graph construction to support attack scenario reconstruction. The approach integrates incremental graph neural networks, system-call provenance graph modeling, dynamic pseudo-edge augmentation, and subgraph reconstruction. Evaluated on state-of-the-art benchmarks, it improves precision at the window, graph, and node levels by 30%, 54%, and 29%, respectively, compared to previous approaches. It targets concept-drift-induced false positives while supporting incremental modeling, precise alerting, and interpretable attack reconstruction over continuous streaming data.
📝 Abstract
As Advanced Persistent Threat (APT) complexity increases, provenance data is increasingly used for detection. Anomaly-based systems are gaining attention because they require no prior attack knowledge and can counter zero-day vulnerabilities. However, traditional detection paradigms, which train on offline, limited-size data, often overlook concept drift, that is, unpredictable changes in the distribution of streaming data over time. This leads to high false positive rates. We propose incremental learning as a new paradigm to mitigate this issue, and we identify four challenges in adopting it. First, the long-running incremental system must combat catastrophic forgetting (C1) and avoid learning malicious behaviors (C2). Second, the system needs to achieve precise alerts (C3) and reconstruct attack scenarios (C4). We present METANOIA, the first lifelong detection system that mitigates the high false positives due to concept drift. It connects pseudo edges to combat catastrophic forgetting, transfers suspicious states to avoid learning malicious behaviors, filters nodes at the path level to achieve precise alerts, and constructs mini-graphs to reconstruct attack scenarios. Using state-of-the-art benchmarks, we demonstrate that METANOIA improves precision at the window, graph, and node levels by 30%, 54%, and 29%, respectively, compared to previous approaches.