🤖 AI Summary
This work addresses the inefficiency and high energy consumption of data aggregation on heterogeneous hardware platforms. The authors implement aggregation with a unified hardware acceleration framework and then add platform-specific optimizations, balancing programmability, portability, and architectural specialization across CPUs, GPUs, and FPGAs. With a common abstraction layer plus optimizations tailored to each hardware target, the approach improves both performance and energy efficiency on all three architectures. The evaluation shows gains in on-device computation as well as end-to-end processing, indicating a workable trade-off between generality and performance for data aggregation workloads.
📝 Abstract
The high efficiency of domain-specific hardware has sparked substantial interest in adopting accelerators in data analytics systems. Among the many choices, GPUs and FPGAs have thrived as two popular solutions owing to their widespread deployment in cloud data centers. This paper investigates hardware acceleration solutions for aggregation, a critical data analytics operation. Specifically, we implement aggregation with a unified hardware acceleration framework, which trades efficiency for ease of programming and portability, and then further develop hardware-specific optimizations. We evaluate these solutions on three recent computing hardware platforms: a CPU, a GPU, and an FPGA, with metrics that cover both the performance and energy consumption of on-device and end-to-end processing.
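For readers unfamiliar with the operation, aggregation can be pictured as a group-by reduction over a key column and a value column. The following is a minimal illustrative sketch in Python, not the paper's implementation: the function name `group_by_sum`, the hash-table approach, and the choice of SUM as the aggregate are all assumptions made for illustration.

```python
from collections import defaultdict


def group_by_sum(keys, values):
    """Hash-based equivalent of: SELECT key, SUM(value) ... GROUP BY key.

    Hypothetical helper for illustration only; the paper's framework
    and kernels are not shown here.
    """
    if len(keys) != len(values):
        raise ValueError("keys and values must have the same length")
    totals = defaultdict(int)
    # One pass over the input: look up the group, update its running sum.
    for key, value in zip(keys, values):
        totals[key] += value
    return dict(totals)


if __name__ == "__main__":
    print(group_by_sum(["a", "b", "a", "c", "b"], [1, 2, 3, 4, 5]))
```

The inner loop is a sequence of data-dependent table updates, which is plausibly why the same operation maps differently onto CPUs, GPUs, and FPGAs and why hardware-specific optimization is of interest.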