🤖 AI Summary
This work addresses the significant performance overhead that service meshes incur by interposing sidecar proxies on the data path, which leads to numerous context switches. It proposes L7FP, a fast-path framework that automatically compiles high-level application-layer (Layer 7) policies into an eBPF-based data plane and enforces them in the kernel, accelerating existing microservices without application code modifications. The implementation supports both TLS and HTTP/2. For the few policies that cannot be enforced in-kernel, L7FP transparently falls back to the existing user-space service proxies (the slow path), which preserves compatibility with those policies. Experimental results show substantial gains: compared to state-of-the-art service meshes, L7FP reduces the median request latency of realistic applications by up to 6× while sustaining 3× more throughput.
📝 Abstract
Service meshes have recently emerged as the de facto standard for deploying microservices. Conceptually, they provide a uniform abstraction for inter-process communication (IPC) between services by implementing common networking mechanisms -- such as encryption, routing, and load balancing -- and by allowing these mechanisms to be configured and composed through high-level policies. Supporting these policies, however, comes with a significant performance cost, since service meshes interpose proxies (``sidecars'') on the data path, leading to numerous context switches.
This paper presents L7FP, a fast path for service meshes that can enforce the vast majority of application-layer policies seen in the wild directly in kernel space. Given high-level policies, L7FP automatically synthesizes an eBPF-based data plane that enforces them in the kernel. L7FP accelerates existing microservices without any code modification, and transparently falls back to existing service proxies (the slow path) for the few unsupported policies.
We fully implemented L7FP, with support for both TLS and HTTP/2. Compared to state-of-the-art service meshes, L7FP reduces the median request latency of realistic applications by up to $6\times$ while sustaining $3\times$ more throughput.
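The fast-path/slow-path split described above can be illustrated with a minimal sketch. Everything in it is a hypothetical illustration, not L7FP's actual design: the `Policy` type, the `FAST_PATH_KINDS` set, and the assumption that a connection uses the slow path as soon as any one of its policies is unsupported in the kernel. The abstract does not specify how the fallback decision is made.

```python
from dataclasses import dataclass

# Hypothetical: policy kinds assumed to be enforceable by the in-kernel fast path.
FAST_PATH_KINDS = {"encrypt", "route", "load_balance"}


@dataclass(frozen=True)
class Policy:
    name: str
    kind: str  # hypothetical label for the mechanism the policy configures


def split_policies(policies):
    """Partition policies into kernel-supported and unsupported ones."""
    supported = [p for p in policies if p.kind in FAST_PATH_KINDS]
    unsupported = [p for p in policies if p.kind not in FAST_PATH_KINDS]
    return supported, unsupported


def choose_path(policies):
    """Assumed rule: use the fast path only if every policy is supported."""
    _, unsupported = split_policies(policies)
    return "slow" if unsupported else "fast"


mesh_policies = [Policy("mtls", "encrypt"), Policy("canary", "route")]
print(choose_path(mesh_policies))
print(choose_path(mesh_policies + [Policy("custom-filter", "wasm_plugin")]))
```

In this sketch the decision is all-or-nothing per connection; a finer-grained fallback would be an equally plausible reading of the abstract.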