🤖 AI Summary
To address the evasion of conventional detection by intermittent ransomware (e.g., BlackCat), this work conducts a byte-level empirical study, systematically analyzing the statistical impact of partial encryption on diverse file structures and constructing the first benchmark dataset for intermittent encryption detection. We propose a hybrid probabilistic model based on KL divergence to theoretically characterize the detectability upper bounds across file formats. Furthermore, we design a block-level convolutional neural network (Block-CNN) that enables fine-grained modeling of local structural anomalies under realistic ransomware configurations. Experiments demonstrate that Block-CNN significantly outperforms global detection methods, achieving an average accuracy improvement of 12.7% on mainstream formats—including PDF, DOCX, and JPEG—while exhibiting strong robustness and cross-format generalization. This work establishes a reproducible theoretical framework and an efficient, practical solution for intermittent encryption detection.
📝 Abstract
File encrypting ransomware increasingly employs intermittent encryption techniques, encrypting only parts of files to evade classical detection methods. These strategies, exemplified by ransomware families like BlackCat, complicate file structure based detection techniques due to diverse file formats exhibiting varying traits under partial encryption. This paper provides a systematic empirical characterization of byte level statistics under intermittent encryption across common file types, establishing a comprehensive baseline of how partial encryption impacts data structure. We specialize a classical KL divergence upper bound on a tailored mixture model of intermittent encryption, yielding filetype specific detectability ceilings for histogram-based detectors. Leveraging insights from this analysis, we empirically evaluate convolutional neural network (CNN) based detection methods using realistic intermittent encryption configurations derived from leading ransomware variants. Our findings demonstrate that localized analysis via chunk level CNNs consistently outperforms global analysis methods, highlighting their practical effectiveness and establishing a robust baseline for future detection systems.