A Survey on Deep Learning Hardware Accelerators for Heterogeneous HPC Platforms

📅 2023-06-27
🏛️ arXiv.org
📈 Citations: 29
✨ Influential: 0
🤖 AI Summary
Selecting and optimizing deep learning (DL) accelerators for high-performance computing (HPC) environments has become increasingly critical due to growing demands for AI–HPC convergence. Method: This paper presents a systematic survey of mainstream and emerging hardware acceleration technologies from 2019 to 2024, covering GPUs/TPUs, FPGAs/ASICs, RISC-V co-processors, in-memory computing (3D-stacked resistive/phase-change memory), neuromorphic processors, chiplet-based packaging, photonic, and quantum accelerators. We propose the first cross-architectural classification framework integrating classical and frontier paradigms to clarify technological evolution trajectories and HPC-specific bottlenecks. Contribution/Results: Based on analysis of over 100 representative works, we construct a high-impact DL accelerator technology landscape, deliver scalable design guidelines for heterogeneous HPC platforms, and introduce quantitative benchmarking criteria for accelerator selection—thereby enabling principled, performance-aware integration of AI and HPC.
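To make the idea of quantitative, performance-aware accelerator selection concrete, the sketch below ranks a few hypothetical devices by a weighted score over throughput, energy efficiency, and memory bandwidth. This is a minimal illustration only: the metric set, weights, device names, and figures are assumptions made for the example, not the benchmarking criteria or data from the survey.

```python
from dataclasses import dataclass

@dataclass
class Accelerator:
    name: str
    throughput_tops: float        # peak throughput (TOPS) -- assumed figure
    efficiency_tops_per_w: float  # energy efficiency (TOPS/W) -- assumed figure
    memory_bw_gbs: float          # memory bandwidth (GB/s) -- assumed figure

def score(acc: Accelerator, baseline: Accelerator,
          weights=(0.4, 0.4, 0.2)) -> float:
    """Weighted sum of per-metric ratios against a baseline device."""
    ratios = (
        acc.throughput_tops / baseline.throughput_tops,
        acc.efficiency_tops_per_w / baseline.efficiency_tops_per_w,
        acc.memory_bw_gbs / baseline.memory_bw_gbs,
    )
    return sum(w * r for w, r in zip(weights, ratios))

# Hypothetical candidates; the numbers are placeholders, not measured values.
candidates = [
    Accelerator("GPU",  300.0, 1.5, 2000.0),
    Accelerator("TPU",  275.0, 2.0, 1200.0),
    Accelerator("FPGA",  50.0, 3.0,  460.0),
]

baseline = candidates[0]
for acc in sorted(candidates, key=lambda a: score(a, baseline), reverse=True):
    print(f"{acc.name}: {score(acc, baseline):.2f}")
```

A real selection criterion would of course weight metrics according to the target workload and platform constraints; the point here is only the shape of a normalized, weighted comparison across heterogeneous devices.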
📝 Abstract
Recent trends in deep learning (DL) have made hardware accelerators the most viable solution for several classes of high-performance computing (HPC) applications, such as image classification, computer vision, and speech recognition. This survey summarizes and classifies the most recent advances in designing DL accelerators able to meet the performance requirements of HPC applications. In particular, it highlights the most advanced approaches to supporting deep learning acceleration, including not only GPU- and TPU-based accelerators but also design-specific hardware accelerators such as FPGA-based and ASIC-based accelerators, Neural Processing Units, open-hardware RISC-V-based accelerators, and co-processors. The survey also describes accelerators based on emerging memory technologies and computing paradigms, such as 3D-stacked Processor-in-Memory, non-volatile memories (mainly Resistive RAM and Phase Change Memories) used to implement in-memory computing, Neuromorphic Processing Units, and accelerators based on Multi-Chip Modules. Among emerging technologies, we also include some insights into quantum-based accelerators and photonics. To conclude, the survey classifies the most influential architectures and technologies proposed in recent years, with the purpose of offering the reader a comprehensive perspective on the rapidly evolving field of deep learning.
Problem

Research questions and friction points this paper is trying to address.

Survey recent DL accelerators for HPC platforms
Explore GPU, TPU, FPGA, ASIC, and emerging hardware
Categorize architectures and technologies in DL acceleration
Innovation

Methods, ideas, or system contributions that make the work stand out.

Survey of GPU- and TPU-based DL accelerators
Explore FPGA, ASIC, and RISC-V accelerators
Cover emerging memory and quantum accelerators
Cristina Silvano
Professor of Computer Architecture, Politecnico di Milano, IEEE Fellow
Computer Architecture, Design Automation, Design Space Exploration, Energy-Aware Computing
Daniele Ielmini
Politecnico di Milano, Italy
Fabrizio Ferrandi
Politecnico di Milano, Italy
Leandro Fiorin
Politecnico di Milano, Italy
Serena Curzel
Politecnico di Milano, Italy
Luca Benini
ETH ZĂźrich, UniversitĂ  di Bologna
Integrated Circuits, Computer Architecture, Embedded Systems, VLSI, Machine Learning
Francesco Conti
Associate Professor, University of Bologna
Hardware Accelerators, Deep Learning, Ultra-Low Power Computing
Angelo Garofalo
University of Bologna, ETH Zurich
HW-efficient Machine Learning, Heterogeneous Computing Architectures, Mixed-Criticality Systems
Cristian Zambelli
UniversitĂ  degli Studi di Ferrara, Italy
Enrico Calore
INFN Ferrara
HPC, Scientific Computing, Parallel Programming, Low Power Embedded Hardware, Brain-Computer Interfaces
Sebastiano Fabio Schifano
University of Ferrara
Maurizio Palesi
Associate Professor, University of Catania, Italy
Embedded Systems, NoC, Networks on Chip
Giuseppe Ascia
UniversitĂ  degli Studi di Catania, Italy
Davide Patti
University of Catania
Networks on Chip, Design Space Exploration, Artificial Intelligence, Bitcoin, Blockchain
Stefania Perri
UniversitĂ  degli Studi della Calabria, Italy
Nicola Petra
UniversitĂ  degli Studi di Napoli Federico II, Italy
Davide De Caro
UniversitĂ  degli Studi di Napoli Federico II, Italy
Luciano Lavagno
Politecnico di Torino, Italy
Teodoro Urso
Politecnico di Torino, Italy
Valeria Cardellini
Università di Roma “Tor Vergata”, Italy
Gian Carlo Cardarilli
Università di Roma “Tor Vergata”, Italy
Robert Birke
UniversitĂ  degli Studi Di Torino