🤖 AI Summary
To address privacy leakage risks in passive network monitoring, this paper proposes the first differentially private query engine tailored for passive measurement scenarios. Built on the Apache Spark distributed framework, the engine supports multiple network data formats, including PCAP and NetFlow, and enables flexible SQL-like queries with a configurable privacy budget (ε). It introduces a noise-injection mechanism and a query optimization strategy designed for the statistical characteristics of network traffic, improving query accuracy while guaranteeing (ε, δ)-differential privacy. An experimental evaluation on real backbone network traffic shows that, across ε ∈ [0.5, 2.0], the system returns analytical results that remain useful (for example, traffic distributions and anomaly detection metrics) while information leakage stays bounded. The approach thus balances strong privacy guarantees against practical data analysis utility.
📝 Abstract
Passive monitoring is a network measurement technique that analyzes the traffic carried by an operational network. It has several applications in traffic engineering, Quality of Experience monitoring, and cyber security. However, it entails processing personal information, thus threatening users' privacy. In this work, we propose DPMon, a tool for running privacy-preserving queries on a dataset of passive network measurements. It uses differential privacy to perturb the output of each query and so preserve users' privacy. DPMon can exploit big data infrastructures running Apache Spark and can operate on different data formats. We show that DPMon allows extracting meaningful insights from the data while controlling the amount of disclosed information.
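To make the idea of perturbing a query output concrete, the following is a minimal sketch of the standard Laplace mechanism applied to a counting query. It is not DPMon's API or implementation: the function names `laplace_noise` and `dp_distinct_clients`, the record fields `client` and `port`, and the toy data are all hypothetical. The sketch assumes the privacy unit is a client, so adding or removing one client changes a distinct-client count by at most 1 (sensitivity 1).

```python
import random


def laplace_noise(scale, rng):
    # The difference of two i.i.d. exponential variables with mean `scale`
    # follows a Laplace distribution centred at 0 with that scale.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def dp_distinct_clients(flows, predicate, epsilon, rng=None):
    """Noisy count of distinct clients with at least one matching flow.

    Assumption: one client = one privacy unit, so the sensitivity is 1
    and Laplace noise with scale 1/epsilon gives epsilon-DP for this query.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    if rng is None:
        rng = random.Random()
    clients = {f["client"] for f in flows if predicate(f)}
    return len(clients) + laplace_noise(1.0 / epsilon, rng)


# Hypothetical flow records; a real dataset would come from passive measurements.
flows = [
    {"client": "a", "port": 443},
    {"client": "a", "port": 443},
    {"client": "b", "port": 443},
    {"client": "c", "port": 80},
]

noisy = dp_distinct_clients(flows, lambda f: f["port"] == 443, epsilon=1.0)
print(f"noisy distinct HTTPS clients: {noisy:.2f}")
```

A smaller ε yields a larger noise scale and therefore stronger privacy at the cost of accuracy. Under sequential composition, each additional query on the same data consumes more of the overall privacy budget.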