A Comparison of Selected Image Transformation Techniques for Malware Classification

📅 2025-09-13
🤖 AI Summary
Existing malware image representation methods inadequately balance executable file structural characteristics with compatibility for image-based deep learning analysis. Method: This paper systematically evaluates eight mainstream byte-to-image conversion strategies—including grayscale mapping, entropy visualization, and RGB encoding—across six deep learning architectures (CNN, ResNet, ViT, etc.), conducting controlled cross-method classification experiments on large-scale malware datasets. Contribution/Results: Empirical results show minimal performance variation (<1.2% in accuracy and F1-score) across conversion methods, while model architecture choice exerts significantly greater influence on classification performance. The study validates the inherent efficacy and robustness of the image-based analysis paradigm itself—not any specific visualization scheme. Crucially, this work provides the first large-scale controlled evidence that, for malware image classification, *how to see* (i.e., model selection) is fundamentally more decisive than *what to see* (i.e., byte-to-image mapping). These findings establish an empirical benchmark and theoretical foundation for methodological design in malware image analysis.
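
To make the idea of "entropy visualization" concrete, here is a minimal sketch of one common way to build such an image: compute the Shannon entropy of fixed-size byte windows and scale each value to a 0-255 pixel intensity. This is an illustration only, not the procedure used in the paper; the function names, the 256-byte window, and the linear scaling are all assumptions.

```python
import math
from collections import Counter


def shannon_entropy(block: bytes) -> float:
    """Shannon entropy of a byte block, in bits per byte (range 0.0 to 8.0)."""
    if not block:
        return 0.0
    n = len(block)
    counts = Counter(block)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def entropy_pixels(data: bytes, window: int = 256) -> list[int]:
    """Map each non-overlapping window to one pixel intensity in 0..255.

    The window size and the linear scaling (entropy / 8 * 255) are
    illustrative assumptions, not values taken from the paper.
    """
    if window <= 0:
        raise ValueError("window must be positive")
    pixels = []
    for start in range(0, len(data), window):
        h = shannon_entropy(data[start:start + window])
        pixels.append(round(h / 8.0 * 255))
    return pixels
```

Low-entropy regions (padding, repeated bytes) map to dark pixels, and high-entropy regions (compressed or encrypted content) map to bright ones. The resulting one-dimensional pixel list would still need to be reshaped into rows before being passed to an image model.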

📝 Abstract
Recently, a considerable amount of malware research has focused on the use of powerful image-based machine learning techniques, which generally yield impressive results. However, before image-based techniques can be applied to malware, the samples must be converted to images, and there is no generally-accepted approach for doing so. The malware-to-image conversion strategies found in the literature often appear to be ad hoc, with little or no effort made to take into account properties of executable files. In this paper, we experiment with eight distinct malware-to-image conversion techniques, and for each, we test a variety of learning models. We find that several of these image conversion techniques perform similarly across a range of learning models, in spite of the image conversion processes being quite different. These results suggest that the effectiveness of image-based malware classification techniques may depend more on the inherent strengths of image analysis techniques, as opposed to the precise details of the image conversion strategy.
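
As a rough illustration of what a malware-to-image conversion involves, the sketch below shows the simplest scheme found in the literature: treat each byte of the file as one grayscale pixel and wrap the byte stream into rows of a fixed width. It is not one of the eight techniques as specified in the paper; the function name, the default width of 256, and the zero padding of the last row are assumptions made for this example.

```python
def bytes_to_grayscale(data: bytes, width: int = 256) -> list[list[int]]:
    """Reshape raw bytes into rows of `width` pixels, one byte per pixel.

    Each byte value (0..255) is used directly as a grayscale intensity.
    The final row is zero-padded; both the width and the padding choice
    are illustrative assumptions.
    """
    if width <= 0:
        raise ValueError("width must be positive")
    if not data:
        return []
    height = -(-len(data) // width)  # ceiling division
    padded = data + bytes(height * width - len(data))
    return [list(padded[r * width:(r + 1) * width]) for r in range(height)]


# Usage: five bytes wrapped into rows of two pixels each.
image = bytes_to_grayscale(b"\x00\x01\x02\x03\x04", width=2)
# image == [[0, 1], [2, 3], [4, 0]]
```

A mapping like this ignores executable structure such as section boundaries, which is the kind of ad hoc choice the abstract refers to.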
Problem

Research questions and friction points this paper is trying to address.

Comparing malware-to-image conversion techniques for classification
Evaluating performance across different machine learning models
Assessing impact of image transformation on malware analysis
Innovation

Methods, ideas, or system contributions that make the work stand out.

Eight malware-to-image conversion techniques tested
Various learning models applied for classification
Effectiveness may depend more on the strengths of image analysis techniques than on the details of the conversion strategy
Rishit Agrawal
Department of Computer Science, San Jose State University
Kunal Bhatnagar
Department of Computer Science, San Jose State University
Andrew Do
Department of Computer Science, San Jose State University
Ronnit Rana
Department of Computer Science, San Jose State University
Mark Stamp
Professor of Computer Science, San Jose State University
information security, cryptography, machine learning