🤖 AI Summary
To improve classification accuracy for Windows PE malware, this paper proposes a multimodal static analysis method that partitions each file by structure and fuses model outputs at the probability level. First, features are extracted from three structural domains defined by the PE format: the headers, the remaining sections, and the entire file. Second, SVM, LSTM, and CNN models are trained on each domain, giving nine baseline models. Third, for each of the nine header-section pairings, the output-layer probabilities of the two component models are used as the feature vector for a second-stage SVM classifier. In the reported experiments, the best multimodal model outperforms the best baseline, suggesting that training separate models on distinct parts of a PE file, combined with decision-level fusion, can benefit malware classification.
📝 Abstract
The threat of malware is a serious concern for computer networks and systems, highlighting the need for accurate classification techniques. In this research, we experiment with multimodal machine learning approaches for malware classification, based on the structured nature of the Windows Portable Executable (PE) file format. Specifically, we train Support Vector Machine (SVM), Long Short-Term Memory (LSTM), and Convolutional Neural Network (CNN) models on features extracted from PE headers, train the same three model types on features extracted from the other sections of PE files, and train each model type on features extracted from the entire PE file. We then train an SVM model on each of the nine header-section combinations of these baseline models, using the output-layer probabilities of the component models as feature vectors. We compare the baseline cases to these multimodal combinations. In our experiments, the best of the multimodal models outperforms the best of the baseline cases, indicating that it can be advantageous to train separate models on distinct parts of Windows PE files.
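The fusion step described above can be illustrated with a minimal sketch. It assumes that the two component models' class probabilities are simply concatenated per sample, which the abstract does not spell out. The function names, the three-class setup, and the probability values are all hypothetical and chosen only for illustration.

```python
from itertools import product

# The three model types named in the abstract.
MODEL_TYPES = ("SVM", "LSTM", "CNN")


def header_section_pairs(model_types=MODEL_TYPES):
    """Enumerate every (header model, sections model) pairing: 3 x 3 = 9."""
    return list(product(model_types, repeat=2))


def fuse_probabilities(header_probs, section_probs):
    """Concatenate per-sample class-probability vectors from two models.

    Assumption: fusion is plain concatenation of the two output vectors.
    """
    if len(header_probs) != len(section_probs):
        raise ValueError("both models must score the same samples")
    return [list(h) + list(s) for h, s in zip(header_probs, section_probs)]


# Made-up probabilities for two samples over three hypothetical classes.
header_cnn = [[0.7, 0.2, 0.1], [0.1, 0.3, 0.6]]
section_lstm = [[0.5, 0.4, 0.1], [0.2, 0.2, 0.6]]

pairs = header_section_pairs()
fused = fuse_probabilities(header_cnn, section_lstm)
```

Each row of `fused` would then serve as one training example for the second-stage SVM of that pairing, with the sample's true label as the target. Repeating this for every entry in `pairs` gives the nine multimodal models that are compared against the baselines.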