Beyond Thread States: Diagnosing Performance Degradation with eBPF and Thread Dynamics

📅 2026-05-24
🤖 AI Summary
This work addresses performance degradation in online data-intensive applications, which often stems from workload fluctuations and resource contention but is hard to diagnose with conventional thread-state analysis because of complex cross-thread dependencies. The authors propose an application-agnostic diagnostic approach that uses eBPF to collect 16 fine-grained metrics across six kernel subsystems (scheduling, VFS, networking, futex, multiplexed I/O, and block device I/O) and combines them with a selective thread-tracking algorithm that traces from entry-point threads to bottlenecked resources. By jointly modeling thread dynamics and thread-resource interactions, the method captures the pathways along which performance degradation propagates. It diagnoses CPU, disk, lock, and external service contention across diverse workloads with low overhead, and it also uncovers internal application bottlenecks.
📝 Abstract
Online data-intensive applications face performance degradation from load variability and resource interference. While approaches based on Thread State Analysis (TSA) can identify constrained subsystems, they lack the granularity to reveal the inter-thread dependencies that propagate degradation. In this paper, we present an application-agnostic performance degradation analysis method that extends TSA by capturing fine-grained thread dynamics. We implemented 16 eBPF-based metrics across six kernel subsystems (scheduling, VFS, networking, futex, multiplexed I/O, and block I/O), which enable tracing thread interactions with specific resources such as futexes, sockets, and disks. Our method leverages two observations: performance degradation propagates along inter-thread dependencies, and a subset of thread-resource interactions is sufficient to capture common degradation patterns. To this end, we employ a selective thread tracking algorithm that traces performance issues from entry-point threads to constrained resources. Experiments with diverse applications under variable workloads and resource contention show that our method successfully diagnoses CPU, disk, lock, and external service contention with minimal overhead, while also revealing internal application constraints.
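The idea of tracing from entry-point threads to constrained resources can be illustrated with a small sketch. This is not the paper's algorithm: it is a simplified breadth-first walk over a hypothetical wait graph, and the names `trace_bottlenecks`, `waits_on`, `holders`, and the `wait_threshold` cutoff are all assumptions introduced for illustration. The input is assumed to be pre-aggregated from per-thread metrics (for example, the fraction of time a thread spent blocked on a futex, socket, or disk).

```python
from collections import deque


def trace_bottlenecks(entry_threads, waits_on, holders, wait_threshold=0.5):
    """Follow dominant waits from entry-point threads to candidate bottlenecks.

    waits_on: {thread: {resource: fraction of time spent waiting on it}}
    holders:  {resource: [threads that hold or serve that resource]}
    All structures and the threshold are hypothetical, for illustration only.
    """
    visited = set(entry_threads)
    queue = deque(entry_threads)
    bottlenecks = set()

    while queue:
        thread = queue.popleft()
        for resource, fraction in waits_on.get(thread, {}).items():
            if fraction < wait_threshold:
                continue  # not a dominant wait, so do not trace through it
            peers = holders.get(resource, [])
            if not peers:
                # No peer thread is responsible: treat the resource itself
                # (e.g. a disk) as a candidate constrained resource.
                bottlenecks.add(resource)
            for peer in peers:
                if peer not in visited:
                    visited.add(peer)
                    queue.append(peer)

    return bottlenecks, visited


# Hypothetical example: an HTTP worker mostly blocks on a futex held by a
# database writer, which in turn mostly blocks on disk I/O.
waits_on = {
    "http-worker": {"futex:0x1": 0.7, "sock:client": 0.1},
    "db-writer": {"disk:sda": 0.8},
    "logger": {"disk:sda": 0.2},
}
holders = {"futex:0x1": ["db-writer"]}

found, traced = trace_bottlenecks(["http-worker"], waits_on, holders)
print(sorted(found), sorted(traced))
```

Only threads reachable through dominant waits are visited, which is the sense in which the tracing is selective here: `logger` is never examined because no traced thread depends on it.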
Problem

Research questions and friction points this paper is trying to address.

performance degradation
thread dynamics
inter-thread dependencies
resource contention
online data-intensive applications
Innovation

Methods, ideas, or system contributions that make the work stand out.

eBPF
thread dynamics
performance degradation diagnosis
inter-thread dependencies
resource contention