PIDSMaker: Building and Evaluating Provenance-based Intrusion Detection Systems

Date: 2026-01-30
Citations: 0
Influential: 0
AI Summary
This work addresses three problems in existing provenance graph-based intrusion detection systems (PIDS): poor reproducibility, unfair comparison, and high reimplementation cost, all of which stem from the absence of a unified evaluation framework. To this end, we propose PIDSMaker, an open-source framework that enables standardized evaluation of PIDS for the first time. PIDSMaker integrates eight state-of-the-art methods under a consistent pipeline with uniform data preprocessing, labeling conventions, and evaluation protocols. Its modular architecture is driven by YAML configuration files, so components can be composed without code changes, which supports rapid prototyping and ablation studies. The framework also ships with visualization tools and publicly released preprocessed datasets. Together, these contributions improve the reproducibility, efficiency, and fairness of PIDS evaluation.

Abstract
Recent provenance-based intrusion detection systems (PIDSs) have demonstrated strong potential for detecting advanced persistent threats (APTs) by applying machine learning to system provenance graphs. However, evaluating and comparing PIDSs remains difficult: prior work uses inconsistent preprocessing pipelines, non-standard dataset splits, and incompatible ground-truth labeling and metrics. These discrepancies undermine reproducibility, impede fair comparison, and impose substantial re-implementation overhead on researchers. We present PIDSMaker, an open-source framework for developing and evaluating PIDSs under consistent protocols. PIDSMaker consolidates eight state-of-the-art systems into a modular, extensible architecture with standardized preprocessing and ground-truth labels, enabling consistent experiments and apples-to-apples comparisons. A YAML-based configuration interface supports rapid prototyping by composing components across systems without code changes. PIDSMaker also includes utilities for ablation studies, hyperparameter tuning, multi-run instability measurement, and visualization, addressing methodological gaps identified in prior work. We demonstrate PIDSMaker through concrete use cases and release it with preprocessed datasets and labels to support shared evaluation for the PIDS community.
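
As a rough illustration of two ideas mentioned in the abstract, configuration-driven component composition and multi-run instability measurement, the sketch below uses a plain Python dict in place of a parsed YAML file. Everything in it is hypothetical: the registry names, the config keys, and the toy scoring logic are assumptions made for illustration and do not reflect PIDSMaker's actual API or configuration schema.

```python
import random
import statistics

# Hypothetical component registries; none of these names come from PIDSMaker.
ENCODERS = {
    "mean": lambda xs: sum(xs) / len(xs),
    "max": lambda xs: max(xs),
}
DETECTORS = {
    "threshold": lambda score, cutoff: score > cutoff,
}

# A dict standing in for a parsed YAML file (illustrative keys only).
config = {
    "encoder": "mean",
    "detector": "threshold",
    "cutoff": 0.5,
    "seeds": [0, 1, 2, 3, 4],
}

def run_once(cfg, seed):
    """Run one toy experiment and return the fraction of samples flagged."""
    rng = random.Random(seed)
    encode = ENCODERS[cfg["encoder"]]   # component chosen by config, not code
    detect = DETECTORS[cfg["detector"]]
    # 100 synthetic samples, each a list of 8 random values in [0, 1).
    samples = [[rng.random() for _ in range(8)] for _ in range(100)]
    flagged = [detect(encode(s), cfg["cutoff"]) for s in samples]
    return sum(flagged) / len(flagged)

def run_many(cfg):
    """Repeat the experiment over all seeds and summarise the spread."""
    rates = [run_once(cfg, seed) for seed in cfg["seeds"]]
    return {
        "runs": rates,
        "mean": statistics.mean(rates),
        "stdev": statistics.stdev(rates),  # run-to-run variability
    }

summary = run_many(config)
# Swapping a component only requires changing a config value.
swapped = run_many({**config, "encoder": "max"})
print(summary["mean"], summary["stdev"])
print(swapped["mean"], swapped["stdev"])
```

The design point is that the experiment code never names a concrete component: swapping the encoder is a one-value config change, and reporting the standard deviation across seeds exposes variability that a single run would hide.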
Problem

Research questions and friction points this paper is trying to address.

provenance-based intrusion detection
evaluation framework
reproducibility
fair comparison
standardization
Innovation

Methods, ideas, or system contributions that make the work stand out.

provenance-based intrusion detection
standardized evaluation framework
modular architecture
YAML-based configuration
reproducible research
Tristan Bilot
University of British Columbia
Baoxiang Jiang
Xi'an Jiaotong University
Thomas Pasquier
Assistant Professor, University of British Columbia
Computer Systems, Computer Security, Provenance, Intrusion Detection