Waiting at the front door: Continuous monitoring of latency in the host network stack

📅 2026-06-01

🤖 AI Summary
As networking moves into the sub-millisecond domain, latency inside the end host, introduced by the interconnect between the network card and the CPU, the kernel network stack, and application scheduling, can become a significant barrier to consistently low application latency, yet production environments lack methods and tools to monitor it continuously. This work presents netstacklat, a monitoring tool that captures latency at several points in the host, from the early parts of the Linux kernel network stack until the application reads the data. In a testbed evaluation, netstacklat captures host latency across 144 variations of Nginx and Apache HTTP workloads and inflates tail latency by no more than 6%, whereas previous monitoring solutions increase it by over 100%. The authors also share initial findings from deploying netstacklat in Cloudflare's global CDN network.
📝 Abstract
With networking moving into the sub-millisecond latency domain, latency in the end host itself can become a significant barrier to achieving consistently low application latency. The physical interconnect between the network card and the CPU, the kernel network stack, and the scheduling of applications themselves can all be considerable sources of latency. Previous work has studied host latency at various levels, yet there remains a lack of methods and tools to continuously monitor host latency in production. To remedy this, we present netstacklat, a monitoring tool that captures latency at several points in the host network stack, from the early parts of the Linux kernel network stack all the way until the application reads the data. We evaluate netstacklat in a testbed, demonstrating its ability to capture host latency across 144 variations of HTTP workloads for Nginx and Apache, while also showing that its low monitoring overhead does not inflate tail latency by more than 6%, whereas previous monitoring solutions increase it by over 100%. Furthermore, we share our initial findings from deploying netstacklat in Cloudflare's global CDN network.
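To make the abstract's two quantities concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes that host latency at a measurement point is the difference between a packet's arrival timestamp and the time the point is reached, and that samples are aggregated into power-of-two histogram buckets, a common choice for low-overhead monitoring that the abstract does not confirm. All function names here are hypothetical.

```python
from collections import Counter


def log2_bucket(latency_ns: int) -> int:
    """Index of the power-of-two bucket [2**i, 2**(i+1)) holding latency_ns."""
    return max(latency_ns, 1).bit_length() - 1


def record_latency(hist: Counter, packet_ts_ns: int, now_ns: int) -> None:
    """Add one sample: time elapsed since the packet's arrival timestamp."""
    delta = now_ns - packet_ts_ns
    if delta < 0:
        return  # skip samples with inconsistent timestamps
    hist[log2_bucket(delta)] += 1


def percentile(samples: list, p: int):
    """Nearest-rank percentile (p is an integer in 1..100)."""
    ordered = sorted(samples)
    rank = max(1, -(-p * len(ordered) // 100))  # ceiling division
    return ordered[rank - 1]


def tail_inflation_pct(baseline: float, monitored: float) -> float:
    """Relative increase of a tail percentile caused by monitoring, in percent."""
    return (monitored - baseline) / baseline * 100
```

Under these assumptions, a p99 that rises from 100 µs without monitoring to 106 µs with monitoring corresponds to a 6% inflation, while a rise to 200 µs or more corresponds to the "over 100%" figure quoted for previous solutions.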
Problem

Research questions and friction points this paper is trying to address.

host latency
network stack
latency monitoring
sub-millisecond latency
production systems
Innovation

Methods, ideas, or system contributions that make the work stand out.

host latency monitoring
network stack
tail latency
low-overhead instrumentation
production deployment