🤖 AI Summary
As networking moves into the sub-millisecond latency domain, host-side latency, such as that introduced by the kernel network stack and application scheduling, can become a significant barrier to consistently low end-to-end latency, yet production environments lack effective means of monitoring it continuously. This work presents netstacklat, a tool for low-overhead, continuous latency monitoring within the Linux kernel network stack. Using lightweight kernel probes and an efficient performance monitoring framework, netstacklat captures latency along the data path from the early parts of the kernel network stack until the application reads the data. In a testbed evaluation across 144 variations of Nginx and Apache HTTP workloads, it inflates tail latency by no more than 6%, whereas previous monitoring solutions increase it by over 100%. The authors also report initial findings from deploying the tool in Cloudflare's global CDN.
📝 Abstract
With networking moving into the sub-millisecond latency domain, latency in the end host itself can become a significant barrier to achieving consistently low application latency. The physical interconnect between the network card and the CPU, the kernel network stack, and the scheduling of the applications themselves can all be considerable sources of latency. Previous work has studied host latency at various levels, yet there remains a lack of methods and tools to continuously monitor host latency in production. To remedy this, we present netstacklat, a monitoring tool that captures latency at several points in the host network, from the early parts of the Linux kernel network stack all the way until the application reads the data. We evaluate netstacklat in a testbed, demonstrating its ability to capture host latency across 144 variations of HTTP workloads for Nginx and Apache. We also show that its low monitoring overhead does not inflate tail latency by more than 6%, whereas previous monitoring solutions increase it by over 100%. Furthermore, we share our initial findings from deploying netstacklat in Cloudflare's global CDN.