🤖 AI Summary
This study addresses the instability of exact spike-based gradient descent in networks of leaky integrate-and-fire (LIF) neurons, whose loss landscape is discontinuous: arbitrarily small parameter perturbations can abruptly create or eliminate spikes, leading to unstable representations and permanently silent neurons. The authors compare LIF and quadratic integrate-and-fire (QIF) neurons in a controlled setting on the Spiking Heidelberg Digits dataset and provide empirical evidence that QIF neurons, owing to their continuous spiking dynamics, substantially mitigate loss landscape fragmentation and achieve better performance. Using exact gradient computation, a thorough hyperparameter search, and loss and gradient landscape visualization, they show that LIF models exhibit more fragmented loss surfaces and more erratic gradients than QIF models. Single-sample analyses trace these features to changes in the temporal order of spikes, which often cause disruptive spike (dis)appearances. These findings support replacing LIF with continuous spiking neuron models, such as QIF neurons, in gradient-based training.
📝 Abstract
The ability to train spiking neural networks is essential for modeling biological neural networks as well as for neuromorphic computing. However, for the extensively used leaky integrate-and-fire (LIF) neurons, arbitrarily small parameter changes can induce spike (dis)appearances that disrupt subsequent activity, leading to unstable neural representations and permanently silent neurons during exact spike-based gradient descent. Recent work shows that a class of neuron models, which includes the quadratic integrate-and-fire (QIF) neuron, avoids these discontinuities and enables continuous and even smooth spike-based gradient descent. However, it remains unclear whether these advantages translate into practice. Here, we demonstrate that they do, using a controlled comparison between networks of LIF and QIF neurons on the popular Spiking Heidelberg Digits dataset. Specifically, in a first step, we perform a thorough hyperparameter search to optimize both models, revealing a clear performance advantage of QIF neurons. In a second step, we visualize the loss and gradient landscapes. Consistent with their inferior performance, we find that the loss landscapes of LIF neurons, which are discontinuous, appear more fragmented and the related gradients more erratic. An analysis of the landscapes of single samples indicates that these features arise from changes in the temporal order of spikes, which often cause disruptive spike (dis)appearances. Overall, our results advocate replacing LIF neurons with neuron models exhibiting continuous spiking dynamics, such as QIF neurons, for gradient descent training.
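The contrast between abrupt and continuous spike (dis)appearance can be illustrated with a toy single-neuron sketch. It is not the authors' network, dataset, or training setup. It assumes normalized units, a single delta input pulse of weight `w` at time 0, an LIF neuron with threshold 1, and a QIF neuron of the form dV/dt = V(V - 1) that spikes when V diverges. The function names and the exponential `readout` are hypothetical, chosen only to show how a downstream quantity depends on `w`.

```python
import math

THETA = 1.0  # LIF threshold; also the QIF unstable fixed point (assumed normalized units)


def lif_spike_time(w):
    """LIF, dV/dt = -V, receiving a delta pulse of weight w at t = 0.

    V jumps to w and then only decays, so the neuron spikes at t = 0
    if w reaches the threshold and never spikes otherwise.
    """
    return 0.0 if w >= THETA else math.inf


def qif_spike_time(w):
    """QIF, dV/dt = V * (V - 1), receiving the same pulse; spike when V diverges.

    For w > 1 the time to divergence is ln(w / (w - 1)), which grows
    without bound as w approaches 1 from above.
    """
    return math.log(w / (w - 1.0)) if w > THETA else math.inf


def readout(t_spike, tau=1.0):
    """Hypothetical downstream trace exp(-t_spike / tau); 0 if there is no spike."""
    return 0.0 if math.isinf(t_spike) else math.exp(-t_spike / tau)


if __name__ == "__main__":
    for w in (0.9, 0.999, 1.0, 1.001, 1.1, 2.0):
        lif = readout(lif_spike_time(w))
        qif = readout(qif_spike_time(w))
        print(f"w={w:<6} LIF readout={lif:.4f}  QIF readout={qif:.4f}")
```

In this sketch the LIF readout jumps from 0 to 1 as `w` crosses the threshold, because the spike appears at a finite time. The QIF readout equals (w - 1)/w for w > 1 and rises continuously from 0, because the spike time moves out to infinity before the spike vanishes.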