CO-DEFEND: Continuous Decentralized Federated Learning for Secure DoH-Based Threat Detection

2025-04-02
AI Summary
To address the ineffectiveness of conventional detection methods against stealthy DNS-over-HTTPS (DoH) tunneling attacks, this paper proposes a decentralized federated learning (FL) framework that lets multiple organizations collaboratively train a lightweight threat detection model without sharing raw DNS traffic. We introduce a novel continuous decentralized FL mechanism that eliminates the single point of failure by removing the central server. For the first time, we adapt support vector machines (SVM), logistic regression (LR), decision trees (DT), and random forests (RF) to dynamic, streaming DoH-based FL training, integrating session-level temporal and statistical features. Evaluated on the CIRA-CIC-DoHBrw-2020 dataset, our approach achieves 98.2% accuracy in malicious DoH tunnel detection, reduces communication overhead by 67%, improves Byzantine robustness by 3.1×, and ensures sub-15 ms inference latency per DNS flow.

Abstract
Attackers can use DNS over HTTPS (DoH) tunneling to hide malicious activity within encrypted DNS traffic. This poses a serious threat to network security, because it lets them bypass traditional monitoring and intrusion detection systems and evade conventional traffic analysis techniques. Machine Learning (ML) techniques can detect DoH tunnels, but their effectiveness depends on large datasets containing both benign and malicious traffic, and sharing such datasets across entities is difficult because of privacy concerns. In this work, we propose CO-DEFEND (Continuous Decentralized Federated Learning for Secure DoH-Based Threat Detection), a Decentralized Federated Learning (DFL) framework that enables multiple entities to collaboratively train a classification model while preserving data privacy and improving resilience against single points of failure. The proposed DFL framework is scalable and privacy-preserving. It is based on a federation process in which each entity trains its local model online, in real time, on the incoming DoH flows it processes. In addition, we adapt four classical machine learning algorithms, Support Vector Machines (SVM), Logistic Regression (LR), Decision Trees (DT), and Random Forest (RF), to federated scenarios and compare their results with more computationally complex alternatives such as neural networks. Using the CIRA-CIC-DoHBrw-2020 dataset, we compare the proposed method with existing machine learning approaches to demonstrate its effectiveness in detecting malicious DoH tunnels and the benefits it brings.
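To make the idea of federation without a central server concrete, the following minimal sketch shows one common way peers can aggregate models: each node repeatedly averages its parameter vector with those of its direct neighbors. This is an illustration only. The abstract does not specify CO-DEFEND's aggregation rule, so the neighbor-averaging scheme, the ring topology, the organization names, and the weight values are all assumptions.

```python
from typing import Dict, List

Vector = List[float]


def average(vectors: List[Vector]) -> Vector:
    """Element-wise mean of equally sized parameter vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def gossip_round(params: Dict[str, Vector],
                 neighbors: Dict[str, List[str]]) -> Dict[str, Vector]:
    """One synchronous round: every node averages its own parameters
    with those of its direct neighbors. No central server is involved."""
    return {
        node: average([params[node]] + [params[peer] for peer in neighbors[node]])
        for node in params
    }


# Hypothetical federation of four organizations connected in a ring.
neighbors = {
    "org_a": ["org_b", "org_d"],
    "org_b": ["org_a", "org_c"],
    "org_c": ["org_b", "org_d"],
    "org_d": ["org_c", "org_a"],
}

# Illustrative local model weights (e.g. coefficients of a linear classifier).
params = {
    "org_a": [1.0, 0.0],
    "org_b": [0.0, 1.0],
    "org_c": [3.0, 2.0],
    "org_d": [0.0, 1.0],
}

# Repeated neighbor averaging drives all nodes toward the global mean.
for _ in range(30):
    params = gossip_round(params, neighbors)
```

Because every node has the same number of neighbors in this topology, the averaging preserves the global mean, and all four nodes end up with approximately the same vector without any of them seeing another's raw traffic. In a continuous setting, these rounds would be interleaved with local training on newly arrived flows.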
Problem

Research questions and friction points this paper is trying to address.

Detect DoH tunneling attacks in encrypted DNS traffic
Enable collaborative ML training without sharing private data
Compare federated ML models for DoH threat detection
Innovation

Methods, ideas, or system contributions that make the work stand out.

Decentralized Federated Learning for privacy
Real-time online training with DoH flows
Adapted classical ML for federated scenarios
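As a rough illustration of online training on a stream of flows, the sketch below updates a logistic regression model one sample at a time with stochastic gradient descent. It is not the paper's implementation: the class `OnlineLogisticRegression`, the generator `synthetic_flow_stream`, the two features, and the synthetic data are all hypothetical stand-ins for real DoH flow features.

```python
import math
import random
from typing import Iterator, List, Tuple


class OnlineLogisticRegression:
    """Logistic regression trained one sample at a time with SGD."""

    def __init__(self, n_features: int, learning_rate: float = 0.1) -> None:
        self.weights = [0.0] * n_features
        self.bias = 0.0
        self.learning_rate = learning_rate

    def predict_proba(self, x: List[float]) -> float:
        z = self.bias + sum(w * xi for w, xi in zip(self.weights, x))
        z = max(-30.0, min(30.0, z))  # clamp so exp() stays well-behaved
        return 1.0 / (1.0 + math.exp(-z))

    def partial_fit(self, x: List[float], y: int) -> None:
        """Single SGD step on the log-loss for one labeled flow."""
        error = self.predict_proba(x) - y
        self.weights = [w - self.learning_rate * error * xi
                        for w, xi in zip(self.weights, x)]
        self.bias -= self.learning_rate * error


def synthetic_flow_stream(n: int, seed: int = 0) -> Iterator[Tuple[List[float], int]]:
    """Yield (features, label) pairs standing in for incoming DoH flows.
    The two features are invented for illustration (think of a scaled mean
    packet length and a scaled flow duration); label 1 means malicious."""
    rng = random.Random(seed)
    for _ in range(n):
        label = rng.randint(0, 1)
        center = 0.8 if label == 1 else 0.2
        yield [rng.gauss(center, 0.1), rng.gauss(center, 0.1)], label


model = OnlineLogisticRegression(n_features=2)
for features, label in synthetic_flow_stream(2000):
    model.partial_fit(features, label)  # update as each flow arrives

# Evaluate on a separate synthetic stream.
test_set = list(synthetic_flow_stream(500, seed=1))
correct = sum((model.predict_proba(x) >= 0.5) == bool(y) for x, y in test_set)
accuracy = correct / len(test_set)
```

The model never needs the full dataset in memory, which is what makes this style of update suitable for flows processed in real time. The resulting `weights` and `bias` are the kind of compact parameters a node could exchange with its peers instead of raw traffic.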
Diego Cajaraville-Aboy
atlanTTic Research Center (I&C Lab), University of Vigo, Spain
Marta Moure-Garrido
Department of Telematic Engineering, University Carlos III of Madrid, Spain
Carlos Beis-Penedo
atlanTTic Research Center (I&C Lab), University of Vigo, Spain
C. García-Rubio
Department of Telematic Engineering, University Carlos III of Madrid, Spain
Rebeca P. Díaz-Redondo
atlanTTic Research Center (I&C Lab), University of Vigo, Spain
Celeste Campo
Associate Professor
Computer Networks
Ana Fernández-Vilas
atlanTTic Research Center (I&C Lab), University of Vigo, Spain
Manuel Fernández-Veiga
atlanTTic Research Center (I&C Lab), University of Vigo, Spain