🤖 AI Summary
To address the high contention and low throughput that strict ordering imposes on concurrent FIFO queues, this paper proposes two scalable queue designs based on relaxed FIFO semantics. The first is a block-based structure that combines MultiQueue-style sharding with circular buffers; the second is built directly on ring buffers, applying them to relaxed queues for the first time. Both designs alleviate synchronization bottlenecks through controlled, localized reordering of elements. Evaluated on microbenchmarks and real-world breadth-first-search workloads on large graphs, the proposed queues significantly outperform state-of-the-art strict and relaxed FIFO queues, achieving up to 3.2× higher throughput under high concurrency and scaling better with thread count. This work offers a new design direction for high-performance concurrent data structures.
📝 Abstract
FIFO queues are a fundamental data structure used in a wide range of applications. Concurrent FIFO queues allow multiple execution threads to access the queue simultaneously. Maintaining strict FIFO semantics in concurrent queues leads to low throughput due to high contention at the head and tail of the queue. By relaxing the FIFO semantics to allow some reordering of elements, it becomes possible to achieve much higher scalability. This work presents two orthogonal designs for relaxed concurrent FIFO queues, one derived from the MultiQueue and the other based on ring buffers. We evaluate both designs extensively on various micro-benchmarks and a breadth-first search application on large graphs. Both designs outperform state-of-the-art relaxed and strict FIFO queues, achieving higher throughput and better scalability.
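The core idea of a MultiQueue-style relaxed FIFO can be sketched in a few lines. The sketch below is illustrative only: the class and parameter names are assumptions, not the paper's API, and this single-threaded model stands in for the lock-free synchronization the real designs provide. Elements are sharded across several sub-queues; dequeue samples two sub-queues and pops the older head, so reordering stays localized rather than global.

```python
import random
from collections import deque

class RelaxedFIFOQueue:
    """Illustrative sketch of a MultiQueue-style relaxed FIFO queue.

    NOT the paper's implementation: a sequential model with deques
    standing in for the circular buffers, and no thread safety.
    """

    def __init__(self, num_queues=8):
        # Several sub-queues instead of one contended queue.
        self.queues = [deque() for _ in range(num_queues)]
        self.counter = 0  # global insertion timestamp (illustrative)

    def enqueue(self, item):
        # Spread insertions across sub-queues to reduce contention.
        q = random.choice(self.queues)
        q.append((self.counter, item))
        self.counter += 1

    def dequeue(self):
        # Two-choice sampling: pop from whichever of two sampled
        # sub-queues has the older head, keeping reordering small.
        candidates = [q for q in self.queues if q]
        if not candidates:
            return None
        a, b = random.choice(candidates), random.choice(candidates)
        q = a if a[0][0] <= b[0][0] else b
        return q.popleft()[1]
```

Dequeues may return elements slightly out of global FIFO order, which is exactly the relaxation that trades strict semantics for scalability.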