Improving the Identification of Real-world Malware's DNS Covert Channels Using Locality Sensitive Hashing

Date: 2025-11-25
AI Summary
Addressing the challenges of low accuracy and poor generalizability in DNS covert channel malware family identification, this paper proposes a subdomain sequence analysis method based on Locality-Sensitive Hashing (LSH). First, subdomain sequences from DNS queries are mapped to LSH fingerprints to capture their statistical similarity. Subsequently, robust sequential features are extracted and fed into a Random Forest classifier for malware family classification and behavioral pattern recognition. To the best of our knowledge, this is the first work to apply LSH to DNS covert channel detection, significantly enhancing detection capability against previously unseen or obfuscated malware variants. Experimental results demonstrate that the proposed method achieves higher detection accuracy and lower false positive rates compared to state-of-the-art approaches, while exhibiting superior generalizability and robustness under domain shifts and query perturbations.
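The fingerprinting step can be illustrated with a minimal sketch. The summary does not say which LSH scheme the paper uses, so MinHash over character trigrams is swapped in here purely as an assumption; the function names (`char_ngrams`, `minhash`, `estimated_similarity`) and the example subdomains are hypothetical.

```python
from __future__ import annotations

import hashlib


def char_ngrams(text: str, n: int = 3) -> set[str]:
    """Return the set of character n-grams of a subdomain string."""
    text = text.lower()
    if len(text) < n:
        return {text}
    return {text[i:i + n] for i in range(len(text) - n + 1)}


def minhash(text: str, num_hashes: int = 64, n: int = 3) -> list[int]:
    """MinHash signature: for each seeded hash, keep the minimum over all n-grams."""
    grams = char_ngrams(text, n)
    signature = []
    for seed in range(num_hashes):
        salt = seed.to_bytes(8, "big")  # a distinct salt per hash function
        signature.append(min(
            int.from_bytes(
                hashlib.blake2b(g.encode("utf-8"), digest_size=8, salt=salt).digest(),
                "big",
            )
            for g in grams
        ))
    return signature


def estimated_similarity(sig_a: list[int], sig_b: list[int]) -> float:
    """Fraction of matching positions, which estimates the Jaccard similarity."""
    return sum(a == b for a, b in zip(sig_a, sig_b)) / len(sig_a)


# Hypothetical subdomains: two encoded-looking labels and one ordinary label.
a = minhash("a1b2c3d4e5f6a7b8")
b = minhash("a1b2c3d4e5f6a7b9")
c = minhash("www")
print(estimated_similarity(a, b), estimated_similarity(a, c))
```

Similar inputs yield signatures that agree in many positions, so near-duplicate subdomains can be compared without an exact string match.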

Abstract
Malware increasingly uses DNS-based covert channels to evade detection and maintain stealthy communication with its command-and-control servers. While prior work has focused on detecting such activity, identifying specific malware families and their behaviors from captured network traffic remains challenging due to the variability of DNS. In this paper, we present the first application of Locality Sensitive Hashing to the detection and identification of real-world malware utilizing DNS covert channels. Our approach encodes DNS subdomain sequences into statistical similarity features that effectively capture anomalies indicative of malicious activity. Combined with a Random Forest classifier, our method achieves higher accuracy and lower false positive rates than prior approaches, while demonstrating improved robustness and generalization to previously unseen or modified malware samples. We further demonstrate that our approach enables reliable classification of malware behavior (e.g., uploading or downloading files) based solely on DNS subdomains.
Problem

Research questions and friction points this paper is trying to address.

Identifying malware families using DNS covert channels remains challenging
Detecting malicious DNS activity with improved accuracy and reduced false positives
Classifying malware behaviors based solely on DNS subdomain sequences
Innovation

Methods, ideas, or system contributions that make the work stand out.

Uses Locality Sensitive Hashing for DNS analysis
Encodes subdomain sequences into similarity features
Combines with Random Forest for improved classification
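As a rough sketch of how a subdomain sequence might be turned into similarity features, the example below fingerprints each subdomain with a 64-bit SimHash and summarizes the Hamming distances between consecutive queries. SimHash, the four chosen statistics, and all function names are assumptions made for illustration; the page does not specify the paper's actual LSH scheme or feature set.

```python
from __future__ import annotations

import hashlib
import statistics


def simhash(text: str, n: int = 3, bits: int = 64) -> int:
    """SimHash over character n-grams: a majority vote per bit position."""
    text = text.lower()
    grams = [text[i:i + n] for i in range(max(1, len(text) - n + 1))]
    counts = [0] * bits
    for g in grams:
        digest = hashlib.blake2b(g.encode("utf-8"), digest_size=bits // 8).digest()
        h = int.from_bytes(digest, "big")
        for b in range(bits):
            counts[b] += 1 if (h >> b) & 1 else -1
    return sum(1 << b for b in range(bits) if counts[b] > 0)


def hamming(a: int, b: int) -> int:
    """Number of differing bits between two fingerprints."""
    return bin(a ^ b).count("1")


def sequence_features(subdomains: list[str]) -> list[float]:
    """Mean, standard deviation, min, and max distance between consecutive queries."""
    fps = [simhash(s) for s in subdomains]
    dists = [hamming(x, y) for x, y in zip(fps, fps[1:])]
    if not dists:  # fewer than two queries: no pairwise distances
        return [0.0, 0.0, 0.0, 0.0]
    return [
        statistics.fmean(dists),
        statistics.pstdev(dists),
        float(min(dists)),
        float(max(dists)),
    ]


# Hypothetical query sequence observed for one host.
print(sequence_features(["a1b2c3d4", "a1b2c3d5", "e5f6a7b8", "www"]))
```

One fixed-length vector per sequence could then be passed to a tree-ensemble classifier such as scikit-learn's `RandomForestClassifier`, with malware family or behavior labels as targets.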
Pascal Ruffing
Center for Research and Technology (ZFT), University of Applied Sciences Worms, 67549 Worms, Germany
Denis Petrov
Center for Research and Technology (ZFT), University of Applied Sciences Worms, 67549 Worms, Germany, and also with the Institute of Information Resource Management (IRM), Ulm University, 89081 Ulm, Germany
Sebastian Zillien
Universität Ulm
Covert Channels, Cyber Security, Anomaly Detection
Steffen Wendzel
University of Ulm
Covert Channels, Information Hiding, Internet Censorship, Censorship Circumvention, Bibliometrics