🤖 AI Summary
Ransomware evades conventional detection through entropy-reduction techniques, such as Base64 obfuscation and intermittent (partial) encryption, which defeat static analysis. Method: This paper proposes an online incremental learning framework for real-time prediction of file encryption behavior. It systematically evaluates how well incremental learning adapts to evolving ransomware, introduces a collaborative detection paradigm with encryption-mode-specific classifiers (AES-Base64 vs. intermittent), pairs Hoeffding Trees with warm-start random forests, and designs multi-format file feature engineering (entropy, byte distribution, structural offsets). Contribution/Results: Evaluated on 32.6 GB of real-world samples spanning 75 ransomware families, the framework achieves 98.2% accuracy for AES-Base64 encryption detection using Hoeffding Trees and a 96.7% F1-score for intermittent encryption using random forests, substantially outperforming static models while enabling resource-efficient, real-time edge deployment.
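To make the entropy-reduction idea concrete, the following is a minimal sketch, not the paper's feature pipeline. It computes byte-level Shannon entropy and shows why Base64 encoding lowers it: the output uses only 64 distinct symbols, so its entropy cannot exceed 6 bits per byte. The function name `shannon_entropy` and the use of `os.urandom` as a stand-in for ciphertext are assumptions made for illustration.

```python
import base64
import math
import os
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Return the Shannon entropy of a byte string in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


# Assumption: random bytes stand in for ciphertext; this is not real ransomware output.
# The length is a multiple of 3 so the Base64 output carries no '=' padding.
ciphertext_like = os.urandom(3 * 16384)
encoded = base64.b64encode(ciphertext_like)

raw_entropy = shannon_entropy(ciphertext_like)
b64_entropy = shannon_entropy(encoded)
print(f"raw: {raw_entropy:.3f} bits/byte, Base64: {b64_entropy:.3f} bits/byte")
```

A detector that flags files only when whole-file entropy approaches 8 bits per byte would miss the Base64-encoded variant, which is one reason entropy is combined with other features such as byte distribution.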
📝 Abstract
In the rapidly evolving landscape of cybersecurity threats, ransomware represents a significant challenge. Attackers increasingly employ sophisticated techniques, such as entropy reduction through Base64 encoding and partial or intermittent encryption, to evade traditional detection methods. This study explores the dynamic battle between adversaries who continuously refine their encryption strategies and defenders who develop advanced countermeasures to protect vulnerable data. We investigate the application of online incremental machine learning algorithms designed to predict file encryption activity despite adversaries' evolving obfuscation techniques. Our analysis uses an extensive dataset of 32.6 GB comprising 11,928 files across multiple formats, including Microsoft Word documents (doc), PowerPoint presentations (ppt), Excel spreadsheets (xlsx), image formats (jpg, jpeg, png, tif, gif), PDFs (pdf), audio (mp3), and video (mp4) files. These files were encrypted by 75 distinct ransomware families, enabling a robust empirical evaluation of machine learning classifiers' effectiveness against diverse encryption tactics. Results highlight the Hoeffding Tree algorithm's superior incremental learning capability, which is particularly effective at detecting traditional encryption and the AES-Base64 encryption used to lower entropy. Conversely, the Random Forest classifier with warm-start functionality excels at identifying intermittent encryption, demonstrating the need for tailored machine learning solutions to counter sophisticated ransomware strategies.
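As an illustration of why intermittent encryption is harder to spot than full encryption, the sketch below profiles entropy per fixed-size chunk instead of over the whole file. It is a hypothetical example, not the study's feature extraction: the helper names, the 4096-byte chunk size, and the 7.5 bits-per-byte threshold are all assumptions, and the sample is synthetic.

```python
import math
import os
from collections import Counter
from typing import List


def shannon_entropy(data: bytes) -> float:
    """Return the Shannon entropy of a byte string in bits per byte."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def chunk_entropies(data: bytes, chunk_size: int = 4096) -> List[float]:
    """Entropy of each consecutive chunk; chunk_size is an assumed value."""
    return [
        shannon_entropy(data[i:i + chunk_size])
        for i in range(0, len(data), chunk_size)
    ]


def high_entropy_fraction(data: bytes, chunk_size: int = 4096,
                          threshold: float = 7.5) -> float:
    """Fraction of chunks whose entropy exceeds an assumed threshold."""
    scores = chunk_entropies(data, chunk_size)
    if not scores:
        return 0.0
    return sum(score > threshold for score in scores) / len(scores)


# Synthetic sample: plaintext-like and random chunks alternate, mimicking
# a file in which only every other block has been encrypted.
plain_chunk = (b"quarterly report draft " * 200)[:4096]
sample = b"".join(
    os.urandom(4096) if i % 2 else plain_chunk for i in range(8)
)

print(high_entropy_fraction(sample))
```

A per-chunk profile like this could serve as one input feature for a classifier; a single whole-file entropy value would average the encrypted and untouched regions together and blur the signal.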