🤖 AI Summary
This work addresses the energy inefficiency of conventional time-to-first-spike (TTFS) coding in spiking neural networks (SNNs), which stems from explicitly computing a temporal decay term and multiplying it by the synaptic weight. The authors take the intrinsic signal decay of indium oxide (In₂O₃) optoelectronic devices, typically regarded as a hardware imperfection, and repurpose it as a computational resource that directly implements the temporal decay in TTFS coding. The method establishes a layer-wise functional equivalence between SNN layers and quantized neural network (QNN) layers, then combines straight-through gradient estimation, device-aware noise sampling, hybrid training, and distillation. Together these avoid differentiating through discrete first-spike events and reduce the over-sparsity seen in direct TTFS-SNN training. The approach scales to Transformer architectures, reaching an average GLUE score of 84.17% while maintaining a clear energy advantage over prior spiking Transformer baselines.
📝 Abstract
Spiking neural networks (SNNs) are promising for energy-efficient inference, and time-to-first-spike (TTFS) coding is especially attractive because each neuron fires at most once. In practice, however, this benefit is often reduced by the cost of computing a temporal decay term and multiplying it by the synaptic weight. We address this issue with Otters++, which turns a physical hardware "bug," the natural signal decay in optoelectronic devices, into the core computation of TTFS. Specifically, we use the measured decay of a custom In$_2$O$_3$ optoelectronic synapse to directly realize the TTFS temporal term, removing the need for explicit digital decay computation. To scale this idea to Transformer models, we establish a layer-wise functional equivalence between an Otters++ layer and a quantized neural network (QNN) layer, and develop a hybrid training method that uses device-faithful SNN computation in the forward pass and straight-through gradients through the equivalent QNN path in the backward pass, together with model distillation. This avoids differentiation through discrete first-spike events and reduces the over-sparsity problem in direct TTFS-SNN training. We further make training aware of measured device noise by sampling run-to-run variation, and refine the system-level energy model by accounting for device sharing and multi-hop communication. On the GLUE benchmark, Otters++ improves the average score to 84.17\% while maintaining a clear energy advantage over prior spiking Transformer baselines. These results show that physically grounded TTFS computing can be efficient, trainable, and robust under realistic hardware effects.
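
The hybrid training idea described in the abstract can be illustrated with a toy single-neuron sketch. It is not the authors' implementation: the exponential decay table, the constants `NUM_STEPS` and `TAU`, the multiplicative Gaussian noise model, and all function names are assumptions made for illustration only. The sketch shows a forward value taken from a noisy "spiking" path whose decay is looked up rather than computed, and gradients taken from an equivalent quantized path with a straight-through estimator.

```python
import math
import random

# Hypothetical decay table standing in for a measured device response:
# DECAY[k] is the remaining signal amplitude for a spike at time step k.
NUM_STEPS = 8   # assumed number of discrete spike times
TAU = 4.0       # assumed time constant, in time steps
DECAY = [math.exp(-k / TAU) for k in range(NUM_STEPS)]


def first_spike_time(x):
    """Encode activation x as the time step whose decayed amplitude is closest to x."""
    return min(range(NUM_STEPS), key=lambda k: abs(DECAY[k] - x))


def qnn_activation(x):
    """Quantized path: snap x to the nearest level in the decay table."""
    return DECAY[first_spike_time(x)]


def snn_activation(x, noise_std=0.0, rng=random):
    """Spiking path: the table supplies the decay; run-to-run noise is sampled."""
    k = first_spike_time(x)
    return DECAY[k] * (1.0 + rng.gauss(0.0, noise_std))


def hybrid_neuron(x, w, noise_std=0.0, rng=random):
    """Return (y, dy/dx, dy/dw): forward from the spiking path, gradients from the quantized path."""
    y = w * snn_activation(x, noise_std, rng)
    # Straight-through estimator: treat quantization as the identity inside
    # the representable range and as a constant outside it.
    in_range = DECAY[-1] <= x <= DECAY[0]
    dy_dx = w if in_range else 0.0
    dy_dw = qnn_activation(x)
    return y, dy_dx, dy_dw


# Noise-free call: the spiking and quantized paths agree exactly.
y, dy_dx, dy_dw = hybrid_neuron(0.5, 2.0)

# Noisy call: the forward value varies from run to run, the gradients do not.
y_noisy, _, _ = hybrid_neuron(0.5, 2.0, noise_std=0.05, rng=random.Random(0))
```

With zero noise the two paths coincide, which is the toy analogue of the layer-wise equivalence; with nonzero `noise_std` only the forward value is perturbed, so the backward pass never differentiates through the discrete spike-time selection.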