Lookup multivariate Kolmogorov-Arnold Networks

📅 2025-09-08
📈 Citations: 0
Influential: 0
🤖 AI Summary
To address the excessive parameter count and computational overhead of high-dimensional linear layers in deep learning, this paper proposes lookup multivariate Kolmogorov–Arnold Networks (lmKANs), which replace conventional linear mappings with trainable low-dimensional multivariate lookup-table functions. Inspired by the Kolmogorov–Arnold representation theorem and implemented via spline-based lookup tables, lmKANs achieve high-dimensional nonlinear transformations with minimal multiplications, enabling efficient CUDA acceleration and seamless integration into mainstream architectures—including MLPs and CNNs. Experiments on CIFAR-10, ImageNet-1k, and methane conformation datasets demonstrate that lmKANs maintain competitive accuracy while reducing inference FLOPs by 1.6×–6.0× and achieving up to 10.3× higher throughput on NVIDIA H100 GPUs. These results significantly improve the trade-off between model capacity and inference efficiency.

📝 Abstract
High-dimensional linear mappings, or linear layers, dominate both the parameter count and the computational cost of most modern deep-learning models. We introduce a general drop-in replacement, lookup multivariate Kolmogorov-Arnold Networks (lmKANs), which deliver a substantially better trade-off between capacity and inference cost. Our construction expresses a general high-dimensional mapping through trainable low-dimensional multivariate functions. These functions can carry dozens or hundreds of trainable parameters each, and yet it takes only a few multiplications to compute them because they are implemented as spline lookup tables. Empirically, lmKANs reduce inference FLOPs by up to 6.0x while matching the flexibility of MLPs in general high-dimensional function approximation. In another feedforward fully connected benchmark, on the tabular-like dataset of randomly displaced methane configurations, lmKANs enable more than 10x higher H100 throughput at equal accuracy. Within frameworks of Convolutional Neural Networks, lmKAN-based CNNs cut inference FLOPs at matched accuracy by 1.6-2.1x and by 1.7x on the CIFAR-10 and ImageNet-1k datasets, respectively. Our code, including dedicated CUDA kernels, is available online at https://github.com/schwallergroup/lmkan.
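The abstract's central trade-off (each lookup-table function carries dozens or hundreds of trainable parameters, yet costs only a few multiplications) can be illustrated with a back-of-envelope count. The grid size and per-evaluation multiply cost below are illustrative assumptions, not figures from the paper:

```python
def params_per_mult_dense():
    # In a dense linear layer, each multiplication touches exactly
    # one trainable weight: capacity per multiply is 1.
    return 1.0

def params_per_mult_table(grid=8, mults_per_eval=4):
    # A 2D lookup table holds grid**2 trainable values, but evaluating
    # it at one point costs only a handful of multiplications
    # (~4 for bilinear interpolation; higher-order splines cost a few
    # more, still independent of the table size).
    return grid**2 / mults_per_eval

print(params_per_mult_table())  # 16.0 -> 16x more parameters per multiply
```

Under these assumed numbers, a pairwise lookup-table layer exposes roughly 16x more trainable parameters per multiplication than a dense layer, which is the mechanism behind the capacity-versus-inference-cost improvement the abstract claims.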
Problem

Research questions and friction points this paper is trying to address.

High-dimensional linear layers dominate the parameter count and compute cost of modern deep-learning models
Reducing the computational cost and parameter count of these layers
Improving the trade-off between model capacity and inference efficiency
Innovation

Methods, ideas, or system contributions that make the work stand out.

Replaces linear layers with lookup multivariate KANs
Uses trainable low-dimensional multivariate spline functions
Implements functions as spline lookup tables
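The core construction can be sketched in a few lines: inputs are grouped into low-dimensional (here, 2D) tuples, and each tuple is mapped through a trainable lookup table evaluated by interpolation. This is a minimal toy version using bilinear (piecewise-linear) interpolation; the actual lmKANs use spline tables and dedicated CUDA kernels, and the names `bilinear_lookup` and `LmKANLayerSketch` are illustrative, not from the paper's code:

```python
import numpy as np

def bilinear_lookup(x, y, table, lo=-1.0, hi=1.0):
    """Evaluate a trainable 2D lookup table at points (x, y) with
    bilinear interpolation -- a piecewise-linear stand-in for the
    spline tables used by lmKANs. `table` has shape (G, G); inputs
    are clipped to [lo, hi]."""
    G = table.shape[0]
    # Map inputs to continuous grid coordinates in [0, G-1].
    u = (np.clip(x, lo, hi) - lo) / (hi - lo) * (G - 1)
    v = (np.clip(y, lo, hi) - lo) / (hi - lo) * (G - 1)
    i0 = np.minimum(u.astype(int), G - 2)
    j0 = np.minimum(v.astype(int), G - 2)
    du, dv = u - i0, v - j0
    # Four table reads and a handful of multiplications per output,
    # no matter how many trainable values the table holds.
    return (table[i0, j0]         * (1 - du) * (1 - dv)
          + table[i0 + 1, j0]     * du       * (1 - dv)
          + table[i0, j0 + 1]     * (1 - du) * dv
          + table[i0 + 1, j0 + 1] * du       * dv)

class LmKANLayerSketch:
    """Toy lmKAN-style layer: inputs are split into 2D pairs, and each
    (input pair, output unit) combination gets its own lookup table
    whose evaluations are summed. The tables are the trainable
    parameters, replacing the weight matrix of a linear layer."""
    def __init__(self, d_in, d_out, grid=8, rng=None):
        assert d_in % 2 == 0, "inputs are grouped into 2D pairs"
        rng = rng or np.random.default_rng(0)
        self.tables = rng.normal(
            scale=0.1, size=(d_out, d_in // 2, grid, grid))

    def __call__(self, x):  # x: (batch, d_in)
        pairs = x.reshape(x.shape[0], -1, 2)  # (batch, d_in//2, 2)
        out = np.zeros((x.shape[0], self.tables.shape[0]))
        for o in range(self.tables.shape[0]):
            for p in range(pairs.shape[1]):
                out[:, o] += bilinear_lookup(
                    pairs[:, p, 0], pairs[:, p, 1], self.tables[o, p])
        return out
```

A usage example: `LmKANLayerSketch(4, 3)(np.zeros((5, 4)))` returns a `(5, 3)` array, so the layer slots in wherever a `4 -> 3` linear layer would. The Python loops here are purely didactic; the paper's throughput gains come from fused CUDA kernels over higher-order spline tables.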