Sorting-based FPGA Sliding Window Aggregation Engine without off-chip Memories

📅 2024-05-28
📈 Citations: 0
Influential: 0
🤖 AI Summary
High-throughput streaming aggregation, and sliding-window aggregation (SWAG) in particular, faces several challenges in grouped and time-series analytics: high hardware overhead, bottlenecks in hash-based per-group state management, and heavy reliance on off-chip memory. To address these, the paper proposes a sorting-based FPGA pipeline architecture. It takes a DRAM-free, sorting-driven approach to SWAG that combines a hardware sorting network, adaptive group-level scheduling, and optimized on-chip memory mapping to balance resource utilization, throughput, and window capacity. In the reported evaluation, the design achieves up to a 476× speedup over the CPU of the same platform, processes 7.14× more tuples per second than the state of the art, and supports windows four times larger, using a fraction of the FPGA resources.

📝 Abstract
Aggregation queries are a class of computationally demanding analytics operations on grouped and time-series data. They include tasks such as summation or finding the median among the items of a group sharing a group ID and, for sliding window aggregation (SWAG), within a specified number of the last observed tuples. They have a wide range of applications, including database analytics, operating systems, bank security and medical sensors. Existing challenges include the hardware complexity of efficiently handling per-group state with hash-based approaches. This paper presents a pipelined and adaptable approach for calculating a wide range of aggregation queries with high throughput. The approach is then adapted for SWAG, achieving up to 476x speedup over the CPU of the same platform. It also outperforms the state of the art: it processes 7.14x more tuples per second and supports 4x the window sizes, with a fraction of the resources and no DRAM.
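
To make the SWAG semantics in the abstract concrete, here is a minimal software reference sketch in Python. It is not the paper's FPGA design. It assumes the window is the last `window_size` tuples across all groups, and that each incoming tuple produces one result for its own group. The names `swag`, `stream`, and `window_size` are hypothetical. Sorting the window by group ID, rather than keeping a hash table of per-group state, only mirrors the sorting-based idea at a high level.

```python
from collections import deque
from itertools import groupby
from statistics import median


def swag(stream, window_size, aggregate=sum):
    """Sliding window aggregation over (group_id, value) tuples.

    For each incoming tuple, yield (group_id, result), where result is
    `aggregate` applied to the values that share the tuple's group ID
    among the last `window_size` observed tuples (an assumed semantics).
    """
    window = deque(maxlen=window_size)  # oldest tuple is evicted automatically
    for group_id, value in stream:
        window.append((group_id, value))
        # Sort by group ID so tuples of the same group become adjacent,
        # instead of maintaining hash-based per-group state.
        ordered = sorted(window, key=lambda t: t[0])
        for gid, run in groupby(ordered, key=lambda t: t[0]):
            if gid == group_id:
                yield group_id, aggregate([v for _, v in run])
                break


example_stream = [(1, 10), (2, 5), (1, 20), (2, 7), (1, 1)]
sums = list(swag(example_stream, window_size=3))
medians = list(swag(example_stream, window_size=3, aggregate=median))
```

With a window of three tuples, `sums` is `[(1, 10), (2, 5), (1, 30), (2, 12), (1, 21)]`. Re-sorting the whole window on every tuple is deliberately naive; it serves only as a behavioural model to check results against.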
Problem

Research questions and friction points this paper is trying to address.

Efficiently handling high-throughput streaming aggregation queries
Overcoming hardware complexity in hash-based per-group state management
Achieving adaptable performance for sliding window aggregation operations
Innovation

Methods, ideas, or system contributions that make the work stand out.

Adaptable pipeline for high-throughput aggregation queries
Achieves up to 476x speedup over the CPU of the same platform for sliding window aggregation
Supports 4x larger window sizes with a fraction of the resources and no DRAM
Philippos Papaphilippou
School of Computer Science and Statistics, Trinity College Dublin, Ireland
Wayne Luk
Professor of Computer Engineering, Imperial College London
Hardware and Architecture, Reconfigurable Computing, Design Automation