SupraSNN: Exploiting Synapse-Level Parallelism in Spiking Neural Network Accelerators through Co-Optimized Mapping and Scheduling

📅 2026-06-11
🤖 AI Summary
This work addresses the challenge of efficiently exploiting synapse-level parallelism in spiking neural networks (SNNs) on hardware by proposing a hardware-software co-design framework inspired by superscalar architectures. Treating synaptic events as parallelizable micro-operations, the framework physically decouples synaptic and neuronal computations. It combines a multicast tree for spike routing, parallel synapse processing units, and a merge (reduction) tree, and it centralizes neuron state in a single neuron unit to avoid duplicating hardware. Memory-constrained network partitioning and a heuristic scheduling strategy then determine the mapping and the synaptic execution order. On a Xilinx Zynq XC7Z020 FPGA, a feedforward SNN reaches 93.44% accuracy on MNIST with 149 μs latency and 0.025 mJ per image, which is 47.6% lower latency and 5.6× better energy efficiency than prior FPGA-based SNN accelerators. On an XC7Z030, a recurrent SNN reaches 71.82% accuracy on the Spiking Heidelberg Dataset with 1.41 ms latency and 0.77 mJ per sample.
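The dataflow described in this summary can be illustrated with a small software model. This is a minimal sketch, not the authors' hardware design: the names `SynapseUnit`, `merge`, `NeuronUnit`, and `timestep` are hypothetical, and the leaky integrate-and-fire update with `threshold` and `leak` parameters is an assumption, since the listing does not state the neuron model.

```python
from dataclasses import dataclass


@dataclass
class SynapseUnit:
    """Hypothetical synapse processing unit holding one slice of the weights."""
    weights: dict  # maps (pre, post) -> weight

    def process(self, spikes):
        # Accumulate the weighted contribution of every spiking presynaptic neuron.
        partial = {}
        for (pre, post), w in self.weights.items():
            if pre in spikes:
                partial[post] = partial.get(post, 0.0) + w
        return partial


def merge(partials):
    # Reduction step: sum the partial currents per postsynaptic neuron.
    total = {}
    for partial in partials:
        for post, current in partial.items():
            total[post] = total.get(post, 0.0) + current
    return total


class NeuronUnit:
    """Single shared unit that owns all membrane state (LIF model is an assumption)."""

    def __init__(self, n, threshold=1.0, leak=0.5):
        self.v = [0.0] * n
        self.threshold = threshold
        self.leak = leak

    def step(self, currents):
        fired = set()
        for i in range(len(self.v)):
            self.v[i] = self.leak * self.v[i] + currents.get(i, 0.0)
            if self.v[i] >= self.threshold:
                fired.add(i)
                self.v[i] = 0.0  # reset after a spike
        return fired


def timestep(spikes, units, neuron_unit):
    # Multicast: every synapse unit receives the same set of input spikes.
    partials = [unit.process(spikes) for unit in units]
    return neuron_unit.step(merge(partials))


units = [
    SynapseUnit({(0, 0): 0.5, (1, 1): 0.25}),
    SynapseUnit({(1, 0): 0.5, (2, 1): 0.25}),
]
neurons = NeuronUnit(2)
first = timestep({0, 1}, units, neurons)
```

In this model the synapse units hold no neuron state at all, so adding units only replicates the simple multiply-accumulate path, which is the motivation the abstract gives for centralizing the neuron dynamics.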
📝 Abstract
Spiking Neural Networks (SNNs) offer a brain-inspired path toward highly efficient computation, but their practical deployment is constrained by the challenge of managing and executing their massive parallelism on physical hardware. This problem mirrors the historical challenge in processor design of moving beyond serial execution, a barrier broken by superscalar architectures that dispatch multiple instructions to parallel functional units. Drawing inspiration from this paradigm, we introduce a hardware-software co-design framework that treats synaptic events as parallelizable micro-operations. We present SupraSNN, a superscalar-inspired architecture that achieves high synapse-level parallelism by physically decoupling synaptic and neuronal computations. Within this architecture, a Multi-Cast Tree routes spike data to multiple parallel Synapse Processing Units, which serve as the computational pipelines, while a Merge Tree consolidates the distributed results for processing by a unified Neuron Unit. Neuron state dynamics are deliberately centralized in this unit to mitigate hardware overhead and resource duplication. The efficacy of this architecture is enabled by a partitioning and scheduling framework that first maps the SNN onto hardware while respecting memory constraints, then applies heuristic scheduling to determine the synaptic execution order, maximizing throughput and resource utilization. Implementing a feedforward SNN trained on MNIST (93.44% accuracy), SupraSNN achieves 149 $\mu$s inference latency and 0.025 mJ per image (0.276 nJ per synapse) on the Xilinx Zynq XC7Z020 FPGA, delivering 47.6% lower latency and 5.6$\times$ better energy efficiency than prior FPGA-based SNN accelerators. Beyond vision tasks, a recurrent SNN on the Spiking Heidelberg Dataset (71.82% accuracy) achieves 1.41 ms latency and 0.77 mJ per sample on the XC7Z030.
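The two-stage flow in the abstract (map under memory constraints, then order the synaptic work) can be sketched as follows. The abstract does not describe the actual partitioning or scheduling heuristics, so this example swaps in a plain round-robin split and a longest-queue cycle estimate as stand-ins. The function names `partition` and `schedule` and the `capacity` parameter (synapses per unit memory) are hypothetical.

```python
def partition(synapses, num_units, capacity):
    """Spread synapses across units round-robin (a stand-in heuristic).

    synapses: list of (pre, post) pairs.
    capacity: assumed maximum number of synapses one unit's memory can hold.
    Raises ValueError if the network does not fit.
    """
    units = [[] for _ in range(num_units)]
    # Sorting by presynaptic neuron places one neuron's synapses on
    # consecutive units, so a single spike fans out to many units.
    for index, synapse in enumerate(sorted(synapses)):
        units[index % num_units].append(synapse)
    for unit in units:
        if len(unit) > capacity:
            raise ValueError("network does not fit in unit memory")
    return units


def schedule(units, spikes):
    """Build per-unit work queues for one timestep.

    Each unit is assumed to process one synapse per cycle, so the longest
    queue sets the cycle count for the timestep.
    """
    queues = [[s for s in unit if s[0] in spikes] for unit in units]
    cycles = max((len(q) for q in queues), default=0)
    return queues, cycles


# Toy network: 3 presynaptic neurons fully connected to 4 postsynaptic neurons.
synapses = [(pre, post) for pre in range(3) for post in range(4)]
units = partition(synapses, num_units=4, capacity=3)
queues, cycles = schedule(units, spikes={0, 2})
serial_cycles = sum(len(q) for q in queues)
```

With two spiking inputs the toy network has eight synaptic events, which the four units finish in two cycles under the one-synapse-per-cycle assumption. A real scheduler would also need to account for memory banking and merge-tree contention, which this sketch ignores.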
Problem

Research questions and friction points this paper is trying to address.

Spiking Neural Networks
synapse-level parallelism
hardware acceleration
parallel execution
computational efficiency
Innovation

Methods, ideas, or system contributions that make the work stand out.

Spiking Neural Networks
Synapse-Level Parallelism
Superscalar Architecture
Hardware-Software Co-Design
Neuromorphic Computing
Seyed Sadra Ghavami
High-Performance Embedded Architecture Laboratory (HiPEAL), School of Electrical and Computer Engineering, College of Engineering, University of Tehran, Tehran, Iran
Mohammad Hossein Nikkhah
High-Performance Embedded Architecture Laboratory (HiPEAL), School of Electrical and Computer Engineering, College of Engineering, University of Tehran, Tehran, Iran
Mohammad Rasoul Roshanshah
High-Performance Embedded Architecture Laboratory (HiPEAL), School of Electrical and Computer Engineering, College of Engineering, University of Tehran, Tehran, Iran
Saeed Safari
Associate Prof. at University of Tehran
Computer Architecture, Fault Tolerant System Design, and AI