Otters++: A Time-to-first-spike Based Energy Efficient Optical Spiking Transformer

📅 2026-06-11

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses the energy inefficiency of conventional time-to-first-spike (TTFS) coding in spiking neural networks (SNNs), which stems from explicitly computing a temporal decay term and multiplying it by the synaptic weight. The authors take the intrinsic signal decay of indium oxide (In₂O₃) optoelectronic devices, typically regarded as a hardware imperfection, and repurpose it as a computational resource that directly implements the temporal decay in TTFS coding. By establishing a layer-wise functional equivalence between SNN layers and quantized neural network (QNN) layers, and combining straight-through gradient estimation, device-aware noise sampling, hybrid training, and knowledge distillation, the method avoids differentiating through discrete first-spike events and reduces the over-sparsity of direct TTFS-SNN training. The approach scales to Transformer architectures, reaching an average GLUE score of 84.17% while maintaining a clear energy advantage over prior spiking Transformer baselines.
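
The training ingredients named above, straight-through gradient estimation and device-aware noise sampling, can be illustrated on a single scalar weight. The following is only a minimal sketch and not the authors' implementation: the function names, the squared-error loss, and the multiplicative Gaussian noise model are all assumptions made for illustration, since the summary does not specify them.

```python
import random

def quantize(x, levels):
    """Snap x to the nearest allowed activation level (non-differentiable)."""
    return min(levels, key=lambda v: abs(v - x))

def forward_backward(w, x, target, levels, noise_std, rng):
    """One training step for y = w * q(x) with a squared-error loss.

    Assumption: device run-to-run variation is modeled as multiplicative
    Gaussian noise; the real system would sample measured device noise.
    """
    xq = quantize(x, levels)
    noisy = xq * (1.0 + rng.gauss(0.0, noise_std))  # sampled device variation
    y = w * noisy
    loss = 0.5 * (y - target) ** 2
    dy = y - target
    grad_w = dy * noisy
    # Straight-through estimator: treat d quantize(x) / dx as 1,
    # so the gradient passes through the discrete step unchanged.
    grad_x = dy * w
    return loss, grad_w, grad_x

rng = random.Random(0)  # seeded generator so noise draws are reproducible
loss, grad_w, grad_x = forward_backward(
    w=2.0, x=0.6, target=0.0, levels=[0.0, 0.5, 1.0], noise_std=0.05, rng=rng
)
```

Setting `noise_std=0.0` recovers the noise-free quantized forward pass, which is a convenient sanity check when comparing against a purely digital baseline.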

📝 Abstract

Spiking neural networks (SNNs) are promising for energy-efficient inference, and time-to-first-spike (TTFS) coding is especially attractive because each neuron fires at most once. In practice, however, this benefit is often reduced by the cost of computing a temporal decay term and multiplying it by the synaptic weight. We address this issue with Otters++, which turns a physical hardware "bug," the natural signal decay in optoelectronic devices, into the main computation of TTFS. Specifically, we use the measured decay of a custom In₂O₃ optoelectronic synapse to directly realize the TTFS temporal term, removing the need for explicit digital decay computation. To scale this idea to Transformer models, we establish a layer-wise functional equivalence between Otters++ and a quantized neural network (QNN), and develop a hybrid training method that uses device-faithful SNN computation in the forward pass and straight-through gradients through the equivalent QNN path in the backward pass, together with model distillation. This avoids differentiation through discrete first-spike events and reduces the over-sparsity problem in direct TTFS-SNN training. We further make training aware of measured device noise by sampling run-to-run variation, and refine the system-level energy model by accounting for device sharing and multi-hop communication. On the GLUE benchmark, Otters++ improves the average score to 84.17% while maintaining a clear energy advantage over prior spiking Transformer baselines. These results show that physically grounded TTFS computing can be efficient, trainable, and robust under realistic hardware effects.
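
One way to picture the layer-wise equivalence between a TTFS layer and a QNN layer is sketched below. This is an illustrative reading under stated assumptions, not the paper's formulation: the exponential `decay` function stands in for the measured device decay, and `T`, `TAU`, and the nearest-level spike-time encoding are hypothetical choices. The idea shown is that each possible spike time corresponds to one decay value, so a weighted sum over decayed spikes matches a dot product over activations quantized to those values.

```python
import math

T = 8      # hypothetical number of discrete time steps
TAU = 3.0  # hypothetical decay constant (stand-in for measured device decay)

def decay(t):
    """Assumed exponential decay; the paper uses measured device decay instead."""
    return math.exp(-t / TAU)

# The activation values that a single spike time can represent.
LEVELS = [decay(t) for t in range(T)]

def encode_spike_time(a):
    """Map an activation to the spike time with the nearest decay value.

    Returns None (no spike) for non-positive activations.
    """
    if a <= 0.0:
        return None
    return min(range(T), key=lambda t: abs(LEVELS[t] - a))

def snn_layer(weights, spike_times):
    """Sum weight * decay(spike time) over the inputs that fired."""
    return sum(w * decay(t) for w, t in zip(weights, spike_times) if t is not None)

def qnn_layer(weights, activations):
    """The same layer as a dot product over activations snapped to LEVELS."""
    q = [0.0 if a <= 0.0 else LEVELS[encode_spike_time(a)] for a in activations]
    return sum(w * x for w, x in zip(weights, q))

weights = [0.5, -1.0, 2.0]
activations = [0.9, 0.0, 0.3]
spike_times = [encode_spike_time(a) for a in activations]
snn_out = snn_layer(weights, spike_times)
qnn_out = qnn_layer(weights, activations)
```

Under these assumptions `snn_out` and `qnn_out` agree, which is what lets gradients be taken through the quantized path while the forward pass uses spike times.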

Problem

Research questions and friction points this paper is trying to address.

Spiking Neural Networks

Time-to-first-spike

Energy Efficiency

Optoelectronic Devices

Transformer

Innovation

Methods, ideas, or system contributions that make the work stand out.

Time-to-first-spike (TTFS)

Optical spiking neural networks

Hardware-aware training