Spiking and Event-driven Neuromorphic Mamba Models for Efficient Speech Recognition

📅 2026-05-31

🤖 AI Summary
This work addresses the high computational and energy cost of deep neural networks in automatic speech recognition (ASR), which hinders deployment on resource-constrained edge devices. It evaluates the hardware benefits of different neuromorphic strategies for ASR by introducing event-driven and spiking neural network (SNN) variants of the SpeechMamba model. The event-driven variant uses the FATReLU activation to raise activation sparsity, turning dense operations into sparse computations. A cycle-accurate, event-driven simulator supports algorithm-hardware co-exploration. The event-driven model achieves over 60% activation sparsity with less than 1% accuracy degradation on LibriSpeech, while the spiking variant attains over 70% sparsity using 30% fewer parameters than comparable SNNs. Computational bottlenecks identified with the simulator yield over 10% additional efficiency improvements.
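To make the link between a thresholded activation and activation sparsity concrete, here is a minimal sketch. It assumes FATReLU follows its usual definition as a ReLU with a forced threshold (values below the threshold are zeroed, values at or above it pass through unchanged). The threshold of 0.5, the Gaussian inputs, and the helper names `fat_relu` and `activation_sparsity` are illustrative assumptions, not values or code from the paper.

```python
import random


def fat_relu(x, threshold):
    """Thresholded ReLU (assumed FATReLU form): keep x only if x >= threshold."""
    return x if x >= threshold else 0.0


def activation_sparsity(values):
    """Fraction of activations that are exactly zero."""
    return sum(1 for v in values if v == 0.0) / len(values)


# Synthetic pre-activations (assumption: standard normal, for illustration only).
rng = random.Random(0)
pre_activations = [rng.gauss(0.0, 1.0) for _ in range(10_000)]

# Plain ReLU zeroes negative inputs; the thresholded variant also zeroes
# small positive inputs, so more activations end up exactly zero.
relu_out = [max(v, 0.0) for v in pre_activations]
fat_out = [fat_relu(v, threshold=0.5) for v in pre_activations]

relu_sparsity = activation_sparsity(relu_out)
fat_sparsity = activation_sparsity(fat_out)
print(f"ReLU sparsity:    {relu_sparsity:.2f}")
print(f"FATReLU sparsity: {fat_sparsity:.2f}")
```

Raising the threshold increases sparsity but discards more small activations, which is the accuracy-versus-sparsity trade-off the reported figures describe.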
📝 Abstract
Deep learning has greatly advanced automatic speech recognition (ASR), enabling widespread deployment on edge devices such as smartphones and smart home systems. However, the computational and energy demands of deep neural networks pose significant challenges for such resource-constrained deployments, introducing latency and limiting real-time interaction. Neuromorphic computing offers a promising solution by introducing activation sparsity through spiking neural networks (SNNs) and event-driven neural networks, converting dense operations into sparse computations. However, a study that evaluates the hardware benefits of different neuromorphic strategies remains lacking for ASR. This paper explores spiking and event-driven neuromorphic neural networks to improve activation sparsity in the state-of-the-art SpeechMamba model for ASR. We introduce an event-driven SpeechMamba with FATReLU activation, achieving over 60% activation sparsity with less than 1% accuracy degradation on LibriSpeech. We also propose a spiking SpeechMamba that attains over 70% sparsity while using 30% fewer parameters than comparable SNNs. Finally, we develop a cycle-accurate event-driven simulator enabling flexible algorithm-hardware co-exploration, which helps us identify computational bottlenecks and yields over 10% additional efficiency improvements.
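The phrase "converting dense operations into sparse computations" can be illustrated with a small sketch of an event-driven matrix-vector product: only nonzero input activations ("events") trigger multiply-accumulate (MAC) operations. This is a generic illustration under stated assumptions, not the paper's simulator; the function names, the 3x5 weight matrix, and the input vector are hypothetical.

```python
def dense_matvec(weights, x):
    """Dense product: every weight is multiplied, even by zero activations."""
    out = [0.0] * len(weights)
    macs = 0
    for i, row in enumerate(weights):
        for j, xj in enumerate(x):
            out[i] += row[j] * xj
            macs += 1
    return out, macs


def event_driven_matvec(weights, x):
    """Event-driven product: zero activations generate no event and no work."""
    out = [0.0] * len(weights)
    macs = 0
    for j, xj in enumerate(x):
        if xj == 0.0:
            continue  # no event for this input, skip its whole weight column
        for i in range(len(weights)):
            out[i] += weights[i][j] * xj
            macs += 1
    return out, macs


# Hypothetical 3x5 layer with 60% activation sparsity (3 of 5 inputs are zero).
weights = [
    [1.0, 2.0, 3.0, 4.0, 5.0],
    [0.5, 1.5, 2.5, 3.5, 4.5],
    [2.0, 0.0, 1.0, 0.0, 3.0],
]
x = [0.0, 1.5, 0.0, 0.0, 2.0]

dense_out, dense_macs = dense_matvec(weights, x)
event_out, event_macs = event_driven_matvec(weights, x)
print(dense_out == event_out, dense_macs, event_macs)
```

In this toy case the MAC count scales with the number of nonzero activations, so 60% sparsity removes 60% of the multiply-accumulates. Real hardware savings depend on event-handling overheads, which is the kind of effect a cycle-accurate simulator is meant to expose.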
Problem

Research questions and friction points this paper is trying to address.

automatic speech recognition
neuromorphic computing
activation sparsity
edge devices
computational efficiency
Innovation

Methods, ideas, or system contributions that make the work stand out.

Spiking Neural Networks
Event-driven Computing
Activation Sparsity
SpeechMamba
Neuromorphic ASR