AI Summary
Backpropagation (BP) incurs high energy consumption and hardware-implementation bottlenecks in large-scale neural networks. Method: This paper proposes Binary Random Forward-Forward (BFF), an energy-efficient supervised learning method that eliminates BP entirely. BFF extends the forward-forward algorithm to binary stochastic neurons implemented with probabilistic bits (p-bits) for efficient binary sampling; introduces bundled weights and differentiated biases to alleviate information bottlenecks; and replaces conventional matrix multiplication with low-overhead indexing operations. Contribution/Results: BFF removes BP's sequential dependencies and eliminates activation storage, drastically reducing computational and memory overhead. Evaluated on MNIST, Fashion-MNIST, and CIFAR-10, BFF achieves accuracy comparable to its real-valued forward-forward counterpart while delivering a measured energy-efficiency improvement of roughly 10×. The method is hardware-friendly and scalable, making it well suited to energy-constrained neuromorphic and edge-AI systems.
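The "bundled weights with differentiated biases" idea can be illustrated with a minimal sketch: several stochastic binary units share one weight vector but each has its own bias, so averaging their one-bit samples recovers a graded readout of the shared pre-activation. All names, shapes, and the logistic sampling rule below are illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

def stochastic_binary_units(x, w, biases):
    """Sample K binary units sharing one weight vector ("bundled" weights)
    but carrying K different biases (illustrative sketch, not the paper's API)."""
    z = w @ x                                  # one shared pre-activation
    p = 1.0 / (1.0 + np.exp(-(z + biases)))   # per-unit firing probability
    # Each unit fires independently, like a p-bit fluctuating around its bias.
    return (rng.random(biases.shape) < p).astype(np.int8)

x = rng.standard_normal(4)
w = rng.standard_normal(4)
biases = np.linspace(-2.0, 2.0, 8)             # differentiated biases
bits = stochastic_binary_units(x, w, biases)   # 8 one-bit samples of the same z
```

Averaging `bits` over units (and over repeated samples) yields a multi-level estimate of the shared pre-activation, which is one way stochasticity plus shared weights can ease the information bottleneck of single binary units.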
Abstract
Reducing energy consumption has become a pressing need for modern machine learning, which has achieved many of its most impressive results by scaling to ever larger and more energy-hungry neural networks. Unfortunately, the main algorithm for training such networks, backpropagation, poses significant challenges for custom hardware accelerators, due both to its serial dependencies and to the memory footprint needed to store forward activations for the backward pass. Alternatives to backpropagation do exist, although they are less effective; for these, the main computational bottleneck becomes matrix multiplication. In this study, we derive forward-forward algorithms for binary, stochastic units. Binarizing the activations transforms matrix multiplications into indexing operations, which can be executed efficiently in hardware. Stochasticity, combined with weights tied across units with different biases, bypasses the information bottleneck imposed by binary units. Furthermore, binary sampling, although slow and expensive in conventional hardware, can be performed quickly and cheaply with p-bits (probabilistic bits), novel devices built from unstable magnets. We evaluate the proposed algorithms on the MNIST, Fashion-MNIST, and CIFAR-10 datasets, showing that their performance is close to that of real-valued forward-forward, with an estimated energy savings of about one order of magnitude.
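The claim that binarization turns matrix multiplication into indexing can be checked with a small sketch: when a layer's activations are 0/1, the matrix-vector product reduces to summing the weight rows selected by the active units, with no multiplications at all. The shapes and variable names below are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)
W = rng.standard_normal((16, 6))    # weights: 16 binary inputs -> 6 hidden units
h = rng.integers(0, 2, size=16)     # binary (0/1) activations from the layer below

# Dense path: the ordinary matrix-vector product a real-valued network would do.
dense = h @ W

# Indexing path: with 0/1 activations, the product is just the sum of the
# weight rows indexed by the active units -- additions only, no multiplies.
indexed = W[h == 1].sum(axis=0)

assert np.allclose(dense, indexed)
```

In hardware this replaces a multiply-accumulate array with simple row lookups and accumulation, which is the source of the efficiency the abstract describes.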