🤖 AI Summary
Problem: In computational genomics, limited on-device memory (8 GB of HBM on the evaluated accelerator) impedes efficient random access to terabyte-scale genomic datasets. Method: This paper introduces Bancroft, an FPGA-accelerated platform, prototyped on a Xilinx Alveo U50, that compresses genomic data moved over PCIe and decompresses it on the device. It adapts reference-based genome compression for efficient accelerator implementation using fixed-stride matching with cuckoo hashes and grouped header encoding, and exposes the result through a familiar random-access interface that gives the illusion of practically infinite on-device memory while using only a small, fixed amount of HBM. Results: The prototype reaches over 30% of on-device HBM bandwidth, an order of magnitude higher than conventional PCIe-limited architectures. For pre-alignment filtering, it delivers an over 6× improvement over the conventional PCIe-attached architecture, attaining 30% of the peak internal throughput of an accelerator with HBM and 90% of one with DDR4.
📝 Abstract
This paper presents Bancroft, a computational genomics acceleration platform that provides the illusion of practically infinite on-device memory capacity by compressing genomic data movement over PCIe. Bancroft introduces novel optimizations for efficient accelerator implementation of reference-based genome compression, including fixed-stride matching using cuckoo hashes and grouped header encoding, incorporated into a familiar interface supporting random accesses. We evaluate a prototype implementation of Bancroft on an affordable Alveo U50 FPGA equipped with 8 GB of HBM. Thanks to orders-of-magnitude improvements in the performance and resource efficiency of genomic compression, our prototype provides access to TBs of host-side genomic data at memory-class performance, measuring speeds over 30% of the on-device HBM bandwidth, an order of magnitude higher than conventional PCIe-limited architectures. Using a real-world pre-alignment filtering application, Bancroft demonstrates over 6x improvement over the conventional PCIe-attached architecture, achieving 30% of the peak internal throughput of an accelerator with HBM, and 90% of that of one with DDR4. Bancroft supports memory-class performance for practically infinite data capacity using a small, fixed amount of HBM, making it an attractive solution for the continued scalability of computational genomics.
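The abstract names fixed-stride matching with cuckoo hashes but gives no implementation details. The sketch below is one plausible software reading of that idea, not the authors' FPGA design. It assumes that reference k-mers are indexed at a fixed stride in a two-table cuckoo hash, and that the input is encoded as reference matches or literal bases. The names `K`, `CuckooTable`, `build_index`, `compress`, and `decompress`, the token format, and the table sizes are all hypothetical. Grouped header encoding is not modeled.

```python
import hashlib

K = 8  # k-mer length and indexing stride; illustrative value, not from the paper


def _hash(kmer: str, seed: int, nbuckets: int) -> int:
    """Seeded hash of a k-mer; the seed selects one of the two cuckoo hash functions."""
    digest = hashlib.blake2b(kmer.encode(), digest_size=8, salt=bytes([seed])).digest()
    return int.from_bytes(digest, "little") % nbuckets


class CuckooTable:
    """Two-table cuckoo hash: every key lives in one of exactly two candidate slots."""

    def __init__(self, nbuckets: int = 1024, max_kicks: int = 64):
        self.nbuckets = nbuckets
        self.max_kicks = max_kicks
        self.slots = [[None] * nbuckets, [None] * nbuckets]

    def lookup(self, kmer: str):
        # A lookup probes at most two slots, which is what makes it hardware friendly.
        for t in (0, 1):
            entry = self.slots[t][_hash(kmer, t, self.nbuckets)]
            if entry is not None and entry[0] == kmer:
                return entry[1]
        return None

    def insert(self, kmer: str, pos: int) -> bool:
        if self.lookup(kmer) is not None:
            return True  # keep the first reference position seen for this k-mer
        entry, t = (kmer, pos), 0
        for _ in range(self.max_kicks):
            idx = _hash(entry[0], t, self.nbuckets)
            # Place the entry and pick up whatever occupied the slot before.
            entry, self.slots[t][idx] = self.slots[t][idx], entry
            if entry is None:
                return True
            t ^= 1  # the evicted entry retries in the other table
        return False  # gave up; the dropped k-mer will simply be encoded as literals


def build_index(reference: str) -> CuckooTable:
    """Index non-overlapping reference k-mers taken every K bases (the fixed stride)."""
    table = CuckooTable()
    for pos in range(0, len(reference) - K + 1, K):
        table.insert(reference[pos:pos + K], pos)
    return table


def compress(seq: str, table: CuckooTable) -> list:
    """Encode seq as ('M', ref_pos) matches of length K or ('L', base) literals."""
    out, i = [], 0
    while i < len(seq):
        pos = table.lookup(seq[i:i + K]) if i + K <= len(seq) else None
        if pos is not None:
            out.append(("M", pos))
            i += K
        else:
            out.append(("L", seq[i]))
            i += 1
    return out


def decompress(tokens: list, reference: str) -> str:
    """Rebuild the sequence from match and literal tokens using the same reference."""
    return "".join(reference[v:v + K] if kind == "M" else v for kind, v in tokens)
```

In this sketch a failed lookup or a dropped insert only reduces the compression ratio, because unmatched bases fall back to literals and decompression reads matched spans straight from the reference. Bounded two-probe lookups are the property that usually motivates cuckoo hashing in hardware; whether Bancroft uses it in exactly this way is an assumption here.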