ProHunter: A Comprehensive APT Hunting System Based on Whole-System Provenance

📅 2026-03-20
🤖 AI Summary
This work proposes an efficient and precise APT hunting system to address the challenges posed by advanced persistent threats, which are notoriously difficult to detect due to their stealthiness and persistence. Existing provenance-graph-based approaches suffer from high memory overhead, coarse-grained attack behavior segmentation, and a semantic gap between cyber threat intelligence (CTI) and system logs. To overcome these limitations, the proposed method compresses whole-system provenance graphs through semantic abstraction and bit-level hierarchical encoding, employs a heuristic threat graph sampling mechanism to accurately extract attack subgraphs, and introduces adaptive graph representation learning to achieve cross-semantic alignment between CTI and provenance data. Experiments on real-world datasets from DARPA TC E3/E5 and OpTC demonstrate that the approach significantly outperforms state-of-the-art methods in both detection accuracy and computational efficiency.

📝 Abstract
Advanced Persistent Threats (APTs) remain difficult to detect due to their stealthy nature and long-term persistence. To tackle this challenge, provenance-based threat hunting has gained traction as a proactive defense mechanism. This technique models audit logs as a whole-system provenance graph and searches for subgraphs that match APT patterns recorded in Cyber Threat Intelligence (CTI) reports. However, several limitations persist: 1) significant memory and time overhead due to the extremely large provenance graphs; 2) imprecise segmentation of APT activities from provenance graphs due to their intricate entanglement with benign operations; and 3) poor alignment of attack representations between CTI-derived query graphs and provenance graphs due to their substantial semantic gaps. To address these limitations, this paper presents ProHunter, an efficient and accurate provenance-based APT hunting system with a platform-independent design. To minimize system overhead, ProHunter creates a compact data structure that efficiently stores long-term provenance graphs using semantic abstraction and bit-level hierarchical encoding strategies. To segment APT behaviors, it uses a heuristic-driven threat graph sampling algorithm that extracts precise attack patterns from provenance graphs. Furthermore, to bridge the semantic gaps between CTI-derived graphs and provenance graphs, ProHunter introduces adaptive graph representation and feature enhancement methods, enabling the extraction of consistent attack semantics at both local and global levels. Extensive evaluations on real-world APT campaigns from the DARPA TC E3, E5, and OpTC datasets demonstrate that ProHunter outperforms state-of-the-art threat hunting systems in terms of efficiency and accuracy. Our code is available at https://github.com/xueboQiu/ProHunter.
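The abstract mentions semantic abstraction and bit-level hierarchical encoding but does not give the layout. As a rough illustration of the general idea only, the sketch below packs an abstracted provenance edge (source entity, event category, destination entity) into a single integer. The field widths, the category numbers, and the function names `pack_node`, `pack_edge`, and `unpack_edge` are all assumptions made for this example and are not taken from ProHunter.

```python
# Illustrative sketch only: field widths and names are hypothetical,
# not the encoding used by ProHunter.
TYPE_BITS = 4    # abstracted entity category (e.g. process, file, socket)
ID_BITS = 24     # index of the entity within its category
EVENT_BITS = 6   # abstracted event category (e.g. read, write, execute)
NODE_BITS = TYPE_BITS + ID_BITS  # 28 bits per node code


def pack_node(node_type: int, node_id: int) -> int:
    """Place the category in the high bits and the index in the low bits."""
    if not 0 <= node_type < (1 << TYPE_BITS):
        raise ValueError("node_type out of range")
    if not 0 <= node_id < (1 << ID_BITS):
        raise ValueError("node_id out of range")
    return (node_type << ID_BITS) | node_id


def unpack_node(code: int) -> tuple[int, int]:
    """Recover (node_type, node_id) from a node code."""
    return code >> ID_BITS, code & ((1 << ID_BITS) - 1)


def pack_edge(src: int, event: int, dst: int) -> int:
    """Pack src | event | dst into 28 + 6 + 28 = 62 bits."""
    if not 0 <= event < (1 << EVENT_BITS):
        raise ValueError("event out of range")
    return (src << (EVENT_BITS + NODE_BITS)) | (event << NODE_BITS) | dst


def unpack_edge(code: int) -> tuple[int, int, int]:
    """Recover (src, event, dst) from an edge code."""
    dst = code & ((1 << NODE_BITS) - 1)
    event = (code >> NODE_BITS) & ((1 << EVENT_BITS) - 1)
    src = code >> (NODE_BITS + EVENT_BITS)
    return src, event, dst


# Usage: a process (category 1, index 42) writes (event 3) to a file
# (category 2, index 7). All numbers here are made up.
proc = pack_node(1, 42)
target = pack_node(2, 7)
edge = pack_edge(proc, 3, target)
```

Because the category occupies the high bits of each node code, all entities of one category fall in a contiguous integer range, which is one reason a hierarchical layout of this kind can be convenient for filtering.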
Problem

Research questions and friction points this paper is trying to address.

Advanced Persistent Threats
provenance graph
threat hunting
semantic gap
graph segmentation
Innovation

Methods, ideas, or system contributions that make the work stand out.

provenance-based threat hunting
semantic abstraction
graph sampling
adaptive graph representation
APT detection
Xuebo Qiu
College of Computer Science and Technology, Zhejiang University of Technology, Hangzhou, 310023, China
Mingqi Lv
College of Geoinformatics, Zhejiang University of Technology, Hangzhou, 310023, China
Yimei Zhang
College of Computer Science and Technology, Zhejiang University of Technology, Hangzhou, 310023, China
Tiantian Zhu
Zhejiang University of Technology
Mobile Security, System Security, Artificial Intelligence
Tieming Chen
College of Geoinformatics, Zhejiang University of Technology, Hangzhou, 310023, China