Catalpa: GC for a Low-Variance Software Stack

📅 2025-09-16
🤖 AI Summary
Traditional performance optimization focuses on average latency, yet user experience is governed by tail latency (e.g., 95th/99th percentiles)—a binary judgment of whether responses are “fast enough.” To address this, we propose a low-latency-variance software-stack design centered on Catalpa, a novel garbage collector for the Bosque language. Catalpa leverages Bosque’s immutability and acyclic reference graph to eliminate read/write barriers, application-thread synchronization, and unbounded GC pauses. It employs lazy reclamation, fixed-overhead memory layout, and non-intrusive algorithms to remove high-variance synchronization costs inherent in conventional collectors. Experimental evaluation demonstrates that Catalpa achieves significantly reduced tail latency and latency jitter—while maintaining high throughput and low memory footprint—thereby enhancing system predictability and response consistency.
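The summary's key claims — lazy reclamation, a fixed per-step cost, and no cycle handling — can be illustrated with a minimal Python sketch of deferred reference counting on an acyclic, immutable heap. All names here (`Obj`, `DeferredRC`, `budget`) are illustrative inventions, not APIs from the Catalpa paper; this is a conceptual model of the bounded-pause idea, not the actual collector.

```python
class Obj:
    """An immutable, acyclic heap object: children are fixed at construction."""
    def __init__(self, children=()):
        self.children = tuple(children)
        self.rc = 0
        for c in self.children:
            c.rc += 1


class DeferredRC:
    """Deferred reclamation: dead objects are queued, and each collector
    step frees at most `budget` of them, so pause time is bounded by a
    constant regardless of how much garbage exists."""
    def __init__(self, budget=4):
        self.budget = budget
        self.pending = []      # zero-count objects awaiting reclamation
        self.reclaimed = 0

    def release(self, obj):
        # Drop one reference; defer the actual (possibly cascading) free.
        obj.rc -= 1
        if obj.rc == 0:
            self.pending.append(obj)

    def step(self):
        # Bounded pause: process at most `budget` objects per step.
        done = 0
        while self.pending and done < self.budget:
            dead = self.pending.pop()
            self.reclaimed += 1
            for c in dead.children:
                c.rc -= 1
                if c.rc == 0:
                    self.pending.append(c)
            done += 1
```

Because the reference graph is acyclic by construction, plain reference counts are always accurate and no backup cycle detector (a classic source of unbounded GC work) is needed.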

📝 Abstract
The performance of an application/runtime is usually conceptualized as a continuous function: the lower the amount of memory/time used on a given workload, the better the compiler/runtime is. However, in practice, good performance of an application is viewed as more of a binary function - either the application responds in under, say, 100 ms, and is fast enough for a user to barely notice, or it takes a noticeable amount of time, leaving the user waiting and potentially abandoning the task. Thus, performance really means how often the application is fast enough to be usable, leading industrial developers to focus on the 95th and 99th percentile tail-latencies as heavily as, or more so than, average response time. Our vision is to create a software stack that actively supports these needs via programming language and runtime system design. In this paper we present a novel garbage-collector design, the Catalpa collector, for the Bosque programming language and runtime. This allocator is designed to minimize latency and variability while maintaining high throughput and incurring small memory overheads. To achieve these goals we leverage various features of the Bosque language, including immutability and reference-cycle freedom, to construct a collector that has bounded collection pauses, incurs fixed-constant memory overheads, and does not require any barriers or synchronization with application code.
Problem

Research questions and friction points this paper is trying to address.

Minimizing garbage collection latency and variability
Achieving bounded collection pauses without synchronization
Maintaining high-throughput with small memory overheads
Innovation

Methods, ideas, or system contributions that make the work stand out.

Bounded garbage collection pauses design
Leverages language immutability and cycle freedom
No barriers or synchronization with application
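The "no barriers" point rests on an invariant worth making concrete: in a fully immutable language, an object's references are fixed at construction, so every reference points from a newer object to a strictly older one. A generational collector therefore never has to track old-to-young pointers with a write barrier or remembered set. The following Python sketch (illustrative names, not from the paper) demonstrates the invariant, assuming a logical allocation clock:

```python
_clock = 0  # logical allocation timestamp

class Node:
    """Immutable node: its children are fixed when it is constructed,
    so they are necessarily older than the node itself."""
    def __init__(self, children=()):
        global _clock
        _clock += 1
        self.birth = _clock
        self.children = tuple(children)

def old_to_young_refs(obj):
    """Count references from an older object to a younger one.
    With immutability this is always 0 - the property that makes
    write barriers and remembered sets unnecessary."""
    return sum(1 for c in obj.children if c.birth > obj.birth)
```

In a mutable language, an assignment like `old.field = young` would violate this invariant and force the runtime to intercept the write; immutability rules such assignments out by construction.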
Anthony Arnold
Department of Computer Science, University of Kentucky
Mark Marron
Microsoft Research
Program Analysis · Program Synthesis · Time-Travel Debugging