🤖 AI Summary
This work investigates fast matrix multiplication algorithms by seeking low-rank decompositions of the matrix multiplication tensor. To this end, the authors propose StrassenNet, a novel neural network architecture that automatically learns low-rank decompositions through end-to-end training. In the 2×2 case, the model consistently converges to a rank-7 solution, accurately recovering Strassen's algorithm. For the 3×3 case, the approach identifies a rank-23 decomposition that significantly outperforms all models of rank ≤ 22, providing numerical evidence that 23 is the minimal effective rank. Furthermore, the study introduces an ε-parameterization technique to model border-rank decompositions, thereby extending the applicability of neural networks to more general tensor decomposition problems.
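As a concrete reference point for what a rank-7 decomposition of the 2×2 multiplication tensor means, the sketch below verifies the classical Strassen algorithm (the known algorithm the network recovers, not the authors' learned parameters) against ordinary matrix multiplication:

```python
import numpy as np

def strassen_2x2(A, B):
    """Multiply two 2x2 matrices with 7 scalar products (Strassen, 1969)."""
    a, b, c, d = A[0, 0], A[0, 1], A[1, 0], A[1, 1]
    e, f, g, h = B[0, 0], B[0, 1], B[1, 0], B[1, 1]
    # The seven bilinear products -- one per rank-one term of the decomposition.
    m1 = (a + d) * (e + h)
    m2 = (c + d) * e
    m3 = a * (f - h)
    m4 = d * (g - e)
    m5 = (a + b) * h
    m6 = (c - a) * (e + f)
    m7 = (b - d) * (g + h)
    # Each output entry is a fixed linear combination of the seven products.
    return np.array([[m1 + m4 - m5 + m7, m3 + m5],
                     [m2 + m4,           m1 - m2 + m3 + m6]])

A = np.array([[1., 2.], [3., 4.]])
B = np.array([[5., 6.], [7., 8.]])
assert np.allclose(strassen_2x2(A, B), A @ B)
```

The three index sets (the linear forms in A, the linear forms in B, and the output combinations) are exactly the three factor matrices of the rank-7 tensor decomposition that the network learns.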
📝 Abstract
Fast matrix multiplication can be described as searching for low-rank decompositions of the matrix multiplication tensor. We design a neural architecture, \textsc{StrassenNet}, which reproduces the Strassen algorithm for $2\times 2$ multiplication. Across many independent runs the network always converges to a rank-$7$ tensor, thus numerically recovering Strassen's optimal algorithm. We then train the same architecture on $3\times 3$ multiplication with rank $r\in\{19,\dots,23\}$. Our experiments reveal a clear numerical threshold: models with $r=23$ attain significantly lower validation error than those with $r\le 22$, suggesting that $r=23$ may be the smallest effective rank of the $3\times 3$ matrix multiplication tensor.
We also sketch an extension of the method to border-rank decompositions via an $\varepsilon$-parametrisation and report preliminary results consistent with the known bounds for the border rank of the $3\times 3$ matrix multiplication tensor.
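To make the border-rank idea concrete, here is a minimal numerical sketch (a textbook example, not the authors' architecture): the tensor $T = x{\otimes}x{\otimes}y + x{\otimes}y{\otimes}x + y{\otimes}x{\otimes}x$ has rank 3, yet it is the limit of rank-2 tensors parameterized by $\varepsilon$, so its border rank is 2. An $\varepsilon$-parametrisation of the kind mentioned above optimizes over such families:

```python
import numpy as np

def outer3(u, v, w):
    # rank-one order-3 tensor u ⊗ v ⊗ w
    return np.einsum('i,j,k->ijk', u, v, w)

x, y = np.array([1., 0.]), np.array([0., 1.])

# Target tensor: rank 3, but border rank 2.
T = outer3(x, x, y) + outer3(x, y, x) + outer3(y, x, x)

def rank2_approx(eps):
    """Rank-2 tensor that converges to T as eps -> 0:
    ((x+eps*y)^{⊗3} - x^{⊗3}) / eps = T + O(eps)."""
    xe = x + eps * y
    return (outer3(xe, xe, xe) - outer3(x, x, x)) / eps

for eps in [1e-1, 1e-2, 1e-3]:
    print(eps, np.linalg.norm(T - rank2_approx(eps)))  # error shrinks linearly in eps
```

The approximation error decays like $O(\varepsilon)$, which is the signature of a genuine border-rank (rather than exact-rank) decomposition.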