Exploiting Kubernetes' Image Pull Implementation to Deny Node Availability

📅 2024-01-19

🏛️ arXiv.org

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work identifies a critical security vulnerability in Kubernetes stemming from the CRI-API abstraction, which hides the state of the container image pull process from Kubernetes. This “state blindness” enables node-level denial-of-service (DoS) attacks: attackers exploit the gap to persistently exhaust CPU, I/O, and network resources, thereby blocking new image pulls. This is the first systematic discovery and empirical exploitation of this architectural flaw. The study proposes a two-phase mitigation: (1) a lightweight runtime interception mechanism as an immediate countermeasure, and (2) a fundamental CRI architecture refactoring to decouple image pulling from the CRI. Validation employs CRI reverse engineering, fine-grained resource monitoring, controlled stress injection, and kernel-level I/O and network modeling. Experiments demonstrate that the attack generates up to 95% average CPU usage on the node and prevents new image pulls for a potentially unlimited time; the interim solution reduces attack success rate by 92%, while the architectural redesign eliminates the state blindness entirely, providing both theoretical foundations and practical engineering guidance for securing Kubernetes container runtimes.

📝 Abstract

Kubernetes (K8s) has grown in popularity over the past few years to become the de facto standard for container orchestration in cloud-native environments. While topics such as containerization and access control security are not new to research, the Application Programming Interface (API) interactions between K8s and its runtime interfaces have not been studied thoroughly. In particular, the CRI-API is responsible for abstracting the container runtime, managing the creation and lifecycle of containers along with the downloads of the respective images. However, this decoupling of concerns and the abstraction of the container runtime render K8s unaware of the status of the image download process, which obstructs monitoring of the resources allocated to that process. In this paper, we discuss how this lack of status information can be exploited to mount a Denial of Service attack in a K8s cluster. We show that such attacks can generate up to 95% average CPU usage, prevent downloading new container images, and increase I/O and network usage for a potentially unlimited amount of time. Finally, we propose two possible mitigation strategies: one implemented as a stopgap solution, and another requiring more radical architectural changes in the relationship between K8s and the CRI-API.
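
The status gap described in the abstract can be illustrated with a toy model. In the CRI image service, a pull is a single request/response call (`PullImage` returns only an image reference once the pull has finished), so the caller observes nothing while the download is in flight. The sketch below is not Kubernetes code: `ToyRuntime`, `pull_image`, `orchestrator_view`, and the layer sizes are hypothetical names and values chosen only to show the shape of the problem.

```python
from dataclasses import dataclass, field


@dataclass
class ToyRuntime:
    """Hypothetical stand-in for a container runtime behind a CRI-like API."""

    bytes_downloaded: int = 0  # internal state, never exposed to the caller
    log: list = field(default_factory=list)

    def pull_image(self, image: str, layer_sizes: list[int]) -> str:
        # All of the work happens inside this single blocking call.
        for size in layer_sizes:
            self.bytes_downloaded += size  # resources are consumed here
            self.log.append((image, size))
        # Only the final reference comes back: no progress, no byte counts.
        return f"ref:{image}"


def orchestrator_view(runtime: ToyRuntime, image: str, layers: list[int]) -> dict:
    """What the caller learns from a pull: the final reference and nothing else."""
    ref = runtime.pull_image(image, layers)
    return {"image_ref": ref}


if __name__ == "__main__":
    rt = ToyRuntime()
    print(orchestrator_view(rt, "example/app", [100, 200]))
    print("bytes the runtime handled (invisible to the caller):", rt.bytes_downloaded)
```

In this model the runtime tracks how much data it has handled, but that figure never crosses the call boundary, which mirrors the monitoring obstacle the abstract describes.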

Problem

Research questions and friction points this paper is trying to address.

Exploiting Kubernetes' image pull to cause Denial of Service

Lack of monitoring in container image downloads

High CPU and I/O usage from unchecked attacks

Innovation

Methods, ideas, or system contributions that make the work stand out.

Exploiting Kubernetes' image pull API

Proposing eBPF-based mitigation MAGI

Detecting and terminating potential attacks
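
The idea of detecting and terminating suspicious pulls can be sketched as a simple budget check. This page does not describe how MAGI works internally, so the following is a generic illustration and not MAGI's eBPF logic: `PullWatchdog`, `PullRecord`, and the time and byte budgets are all hypothetical, and timestamps are passed in by the caller to keep the example deterministic.

```python
from dataclasses import dataclass


@dataclass
class PullRecord:
    image: str
    started_at: float  # seconds, caller-supplied timestamp
    bytes_seen: int = 0


class PullWatchdog:
    """Hypothetical budget check for in-flight image pulls (not MAGI itself)."""

    def __init__(self, max_seconds: float, max_bytes: int):
        self.max_seconds = max_seconds
        self.max_bytes = max_bytes
        self.inflight: dict[str, PullRecord] = {}

    def start(self, pull_id: str, image: str, now: float) -> None:
        self.inflight[pull_id] = PullRecord(image, now)

    def observe(self, pull_id: str, nbytes: int) -> None:
        # Account for bytes attributed to this pull.
        self.inflight[pull_id].bytes_seen += nbytes

    def over_budget(self, now: float) -> list[str]:
        """Return IDs of pulls that exceeded the time or byte budget."""
        return [
            pid
            for pid, rec in self.inflight.items()
            if now - rec.started_at > self.max_seconds
            or rec.bytes_seen > self.max_bytes
        ]

    def terminate(self, pull_id: str) -> PullRecord:
        # A real system would cancel the download here; this sketch only
        # drops the record and returns it for logging.
        return self.inflight.pop(pull_id)
```

A caller would invoke `over_budget` periodically and pass each returned ID to `terminate`. The budgets are a policy choice: values that are too tight would interrupt legitimate pulls of large images.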
