Malware Detection in Docker Containers: An Image is Worth a Thousand Logs

📅 2025-04-04

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Problem: Detecting malware injection in containerized environments remains challenging because traditional log- and behavior-based approaches perform poorly against obfuscated or polymorphic samples. Method: This paper proposes a vision-driven, end-to-end detection framework. It encodes Docker container tarballs into large RGB images for the first time and introduces COSOCO, the first publicly available container image dataset. A streaming, block-wise CNN architecture, built upon ResNet and VGG variants, learns features directly from visualizations of the raw binary structure. Contribution/Results: By shifting from dynamic analysis to a static visual representation of low-level binary layouts, the method offers a new way to model container security risk. Evaluated on COSOCO, it achieves significantly higher F1-score and recall than both individual and ensemble VirusTotal engines, improving detection rate by 12.7%. This work sets a new benchmark for containerized malware detection.

📝 Abstract

Malware detection is increasingly challenged by evolving techniques like obfuscation and polymorphism, which limit the effectiveness of traditional methods. Meanwhile, the widespread adoption of software containers has introduced new security challenges, including the growing threat of malicious software injection, where a container, once compromised, can serve as an entry point for further cyberattacks. In this work, we address these security issues by introducing a method to identify compromised containers through machine learning analysis of their file systems. We cast entire software containers into large RGB images via their tarball representations, and propose to use established Convolutional Neural Network architectures in a streaming, patch-based manner. To support our experiments, we release the COSOCO dataset, the first of its kind, containing 3364 large-scale RGB images of benign and compromised software containers at https://huggingface.co/datasets/k3ylabs/cosoco-image-dataset. Our method detects more malware and achieves higher F1 and Recall scores than every individual VirusTotal engine and every ensemble of them, demonstrating its effectiveness and setting a new standard for identifying malware-compromised software containers.
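
To make the tarball-to-image idea concrete, here is a minimal sketch in Python using only the standard library. The abstract does not spell out the exact encoding, so the mapping below (three consecutive bytes per pixel, zero-padding at the end, a fixed row width) is an assumption, and the helper names `bytes_to_rgb_rows` and `make_demo_tarball` are hypothetical. A tiny in-memory tarball stands in for a real exported container.

```python
import io
import math
import tarfile


def bytes_to_rgb_rows(data: bytes, width: int) -> list[list[tuple[int, int, int]]]:
    """Pack raw bytes into rows of (R, G, B) pixels, zero-padding the tail.

    Assumed encoding: bytes 0..2 form pixel 0, bytes 3..5 form pixel 1, and so on.
    """
    if width <= 0:
        raise ValueError("width must be positive")
    row_bytes = width * 3
    n_rows = max(1, math.ceil(len(data) / row_bytes))
    padded = data.ljust(n_rows * row_bytes, b"\x00")
    rows = []
    for r in range(n_rows):
        chunk = padded[r * row_bytes:(r + 1) * row_bytes]
        rows.append([tuple(chunk[i:i + 3]) for i in range(0, row_bytes, 3)])
    return rows


def make_demo_tarball() -> bytes:
    """Build a tiny in-memory tarball standing in for an exported container."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        payload = b"#!/bin/sh\necho hello\n"
        info = tarfile.TarInfo(name="bin/hello.sh")
        info.size = len(payload)
        tar.addfile(info, io.BytesIO(payload))
    return buf.getvalue()


tar_bytes = make_demo_tarball()
image = bytes_to_rgb_rows(tar_bytes, width=64)
print(len(image), "rows of", len(image[0]), "pixels")
```

A real container tarball is far larger than this toy archive, which is why the resulting images are described as large and why the abstract pairs the encoding with streaming, patch-based processing.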

Problem

Research questions and friction points this paper is trying to address.

Detect malware in Docker containers using image analysis

Address security challenges from malicious software injection

Improve detection performance (F1 and recall) beyond individual and ensemble VirusTotal engines

Innovation

Methods, ideas, or system contributions that make the work stand out.

Convert containers to RGB images for analysis

Apply established CNN architectures in a streaming, patch-based manner

Release the COSOCO dataset of 3364 RGB images of benign and compromised containers
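
The streaming, patch-based idea can be sketched as follows in plain Python. This is an illustration under stated assumptions, not the paper's architecture: the non-overlapping tiling, the `mean_intensity` scorer (a stand-in for a CNN forward pass), and the max aggregation in `stream_score` are all hypothetical choices made to keep the example self-contained.

```python
from typing import Callable, Iterator, List, Sequence, Tuple

Pixel = Tuple[int, int, int]
Tile = List[List[Pixel]]


def iter_patches(image: Sequence[Sequence[Pixel]], patch: int) -> Iterator[Tile]:
    """Lazily yield non-overlapping tiles; tiles at the edges may be smaller."""
    height = len(image)
    width = len(image[0]) if height else 0
    for top in range(0, height, patch):
        for left in range(0, width, patch):
            yield [list(row[left:left + patch]) for row in image[top:top + patch]]


def mean_intensity(tile: Tile) -> float:
    """Placeholder scorer in [0, 1]; a real system would run a CNN here."""
    values = [c for row in tile for px in row for c in px]
    return sum(values) / (255 * len(values))


def stream_score(
    image: Sequence[Sequence[Pixel]],
    patch: int,
    scorer: Callable[[Tile], float],
) -> float:
    """Score one tile at a time, keeping only a running maximum (assumed aggregation)."""
    best = 0.0
    for tile in iter_patches(image, patch):
        best = max(best, scorer(tile))
    return best


# 4x4 toy image: all black except one white pixel in the bottom-right corner.
toy = [[(0, 0, 0)] * 4 for _ in range(4)]
toy[3][3] = (255, 255, 255)
score = stream_score(toy, patch=2, scorer=mean_intensity)
print(score)
```

Because the generator hands over one tile at a time, memory use depends on the patch size rather than on the full image, which is the practical motivation for streaming over very large inputs.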

🔎 Similar Papers

Preliminary study on artificial intelligence methods for cybersecurity threat detection in computer networks based on raw data packets

2024-07-24 · arXiv.org · Citations: 0
