🤖 AI Summary
Traditional RNNs compress temporal dynamics into fixed-dimensional hidden states, resulting in low memory resolution, difficulty incorporating physical priors, and reliance on gradient-based updates during inference. To address these limitations, we propose WARP, a novel framework that explicitly parameterizes the RNN's hidden state as the *weights* of a root neural network and applies a linear recurrence directly in weight space. This paradigm enables gradient-free online adaptation, high-resolution memory modeling, and seamless integration of domain-specific physical constraints. WARP also offers enhanced interpretability, since its weight-evolution trajectories are explicit and analyzable, while maintaining strong generalization across diverse tasks. Empirically, it matches or surpasses state-of-the-art performance on sequence classification, sequential image completion, dynamical system reconstruction, and multivariate time-series forecasting.
📝 Abstract
We introduce WARP (Weight-space Adaptive Recurrent Prediction), a simple yet powerful framework that unifies weight-space learning with linear recurrence to redefine sequence modeling. Unlike conventional recurrent neural networks (RNNs), which collapse temporal dynamics into fixed-dimensional hidden states, WARP explicitly parameterizes the hidden state as the weights of a distinct root neural network. This formulation promotes higher-resolution memory, gradient-free adaptation at test time, and seamless integration of domain-specific physical priors. Empirical validation shows that WARP matches or surpasses state-of-the-art baselines on diverse classification tasks, spanning synthetic benchmarks to real-world datasets. Furthermore, extensive experiments across sequential image completion, dynamical system reconstruction, and multivariate time series forecasting demonstrate its expressiveness and generalization capabilities. Critically, WARP's weight trajectories offer valuable insights into the model's inner workings. Ablation studies confirm the architectural necessity of key components, solidifying weight-space linear RNNs as a transformative paradigm for adaptive machine intelligence.
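To make the core idea concrete, here is a minimal sketch of a weight-space linear recurrence: the hidden state is the flattened weight vector of a small "root" network, updated by a linear rule rather than by gradient descent. All dimensions, the transition matrices `A` and `B`, and the two-layer root architecture are illustrative assumptions, not details taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes (illustrative only, not from the paper).
d_in, d_hid, d_out = 3, 8, 2                 # root network: d_in -> d_hid -> d_out
n_params = d_in * d_hid + d_hid * d_out      # flattened weight count (biases omitted)

def root_forward(theta, q):
    """Evaluate the root network whose weights are the flattened vector theta."""
    W1 = theta[: d_in * d_hid].reshape(d_in, d_hid)
    W2 = theta[d_in * d_hid:].reshape(d_hid, d_out)
    return np.tanh(q @ W1) @ W2

# Linear recurrence acting directly on the weight vector:
#   theta_t = A @ theta_{t-1} + B @ x_t   (a gradient-free state update)
A = 0.95 * np.eye(n_params)                  # contractive transition for stability
B = rng.normal(scale=0.1, size=(n_params, d_in))

theta = np.zeros(n_params)                   # initial hidden state = root weights
xs = rng.normal(size=(5, d_in))              # a length-5 input sequence
for x in xs:
    theta = A @ theta + B @ x                # recurrence in weight space, no gradients

y = root_forward(theta, xs[-1])              # read out via the root network: shape (d_out,)
```

Because the hidden state *is* a weight vector, the trajectory of `theta` over time can be inspected directly, which is the interpretability property the abstract highlights.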