🤖 AI Summary
This work addresses the challenge of efficiently implementing the learnable nonlinear edge functions of Kolmogorov–Arnold Networks (KANs) in hardware by proposing an analog KAN architecture built around a Reconfigurable Nonlinear Processing Unit (RNPU). The design uses the RNPU, a multi-terminal nanoscale silicon device that natively supports programmable nonlinear transformations, as its fundamental computational element, integrating it with mixed-signal interfaces to achieve high parameter efficiency, low power consumption, and a small silicon footprint for edge neural-network deployment. Experimental results show that, at comparable approximation error, the proposed architecture reduces energy consumption by two to three orders of magnitude and chip area by roughly one order of magnitude relative to digital fixed-point MLP implementations, reaching a single-inference energy cost of just 250 pJ at a latency of about 600 ns.
📝 Abstract
Kolmogorov-Arnold Networks (KANs) shift neural computation from linear layers to learnable nonlinear edge functions, but implementing these nonlinearities efficiently in hardware remains an open challenge. Here we introduce a physical analog KAN architecture in which edge functions are realized in materia using reconfigurable nonlinear-processing units (RNPUs): multi-terminal nanoscale silicon devices whose input-output characteristics are tuned via control voltages. By combining multiple RNPUs into an edge processor and assembling these blocks into a reconfigurable analog KAN (aKAN) architecture with integrated mixed-signal interfacing, we establish a realistic system-level hardware implementation that enables compact KAN-style regression and classification with programmable nonlinear transformations. Using experimentally calibrated RNPU models and hardware measurements, we demonstrate accurate function approximation across increasing task complexity while requiring fewer or comparable trainable parameters than multilayer perceptrons (MLPs). System-level estimates indicate an energy per inference of $\sim$250 pJ and an end-to-end inference latency of $\sim$600 ns for a representative workload, corresponding to a $\sim$10$^{2}$-10$^{3}\times$ reduction in energy accompanied by a $\sim$10$\times$ reduction in area compared to a digital fixed-point MLP at similar approximation error. These results establish RNPUs as scalable, hardware-native nonlinear computing primitives and identify analog KAN architectures as a realistic silicon-based pathway toward energy-, latency-, and footprint-efficient analog neural-network hardware, particularly for edge inference.
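For context on the KAN formulation referenced above, the sketch below shows a minimal KAN-style layer: every edge applies its own learnable univariate nonlinearity and each node simply sums its incoming edge outputs, in contrast to an MLP's weight matrix followed by a fixed activation. The Gaussian radial-basis expansion used here is an illustrative stand-in of ours, not the paper's RNPU transfer characteristic or its training procedure.

```python
import numpy as np

def edge_fn(x, coeffs, centers, width=1.0):
    # Learnable univariate edge function, modeled here as a sum of
    # Gaussian basis functions (basis choice is an assumption for
    # illustration; the paper realizes edges physically with RNPUs).
    return sum(c * np.exp(-((x - m) / width) ** 2)
               for c, m in zip(coeffs, centers))

def kan_layer(x, coeffs, centers):
    # x: (n_in,) inputs; coeffs: (n_out, n_in, n_basis) edge parameters.
    # Each output node sums its incoming nonlinear edge outputs --
    # there is no linear weight matrix as in an MLP layer.
    n_out, n_in, _ = coeffs.shape
    return np.array([
        sum(edge_fn(x[i], coeffs[j, i], centers) for i in range(n_in))
        for j in range(n_out)
    ])

rng = np.random.default_rng(0)
centers = np.linspace(-1.0, 1.0, 5)       # shared basis centers
coeffs = rng.normal(size=(2, 3, 5))       # 2 outputs, 3 inputs, 5 bases
y = kan_layer(rng.normal(size=3), coeffs, centers)
print(y.shape)  # (2,)
```

In an analog KAN as described above, each such edge function would be implemented by an RNPU whose input-output curve is shaped by control voltages rather than by digital basis coefficients.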