Neural Learning of Fast Matrix Multiplication Algorithms: A StrassenNet Approach

📅 2026-02-25
📈 Citations: 0
✨ Influential: 0
🤖 AI Summary
This work investigates fast matrix multiplication algorithms by searching for low-rank decompositions of the matrix multiplication tensor. The authors propose StrassenNet, a neural network architecture that learns such decompositions through end-to-end training. In the 2×2 case, the model consistently converges to a rank-7 solution, numerically recovering Strassen's algorithm. In the 3×3 case, models of rank 23 attain significantly lower error than all models of rank ≤ 22, suggesting that 23 may be the smallest effective rank. The study also introduces an ε-parameterization technique to model border-rank decompositions, extending the applicability of neural networks to more general tensor decomposition problems.
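The rank-7 claim can be checked concretely: Strassen's seven rank-one terms sum exactly to the 2×2 matrix multiplication tensor. The NumPy sketch below is illustrative (it is not the authors' code, and the row-major flattening convention for the tensor is an assumption of this sketch):

```python
import numpy as np

def matmul_tensor(n):
    """T[a, b, c] = 1 iff A-entry a times B-entry b contributes to C-entry c,
    with A, B, C flattened row-major (a convention assumed by this sketch)."""
    T = np.zeros((n * n, n * n, n * n))
    for i in range(n):
        for j in range(n):
            for k in range(n):
                T[i * n + k, k * n + j, i * n + j] = 1.0
    return T

# Strassen's seven products M1..M7 as rank-one terms. Column r holds the
# coefficients of [A11, A12, A21, A22] (U), [B11, B12, B21, B22] (V), and
# the contribution of M_r to [C11, C12, C21, C22] (W).
U = np.array([[1,  0, 1, 0, 1, -1,  0],
              [0,  0, 0, 0, 1,  0,  1],
              [0,  1, 0, 0, 0,  1,  0],
              [1,  1, 0, 1, 0,  0, -1]], dtype=float)
V = np.array([[1,  1, 0, -1, 0, 1, 0],
              [0,  0, 1,  0, 0, 1, 0],
              [0,  0, 0,  1, 0, 0, 1],
              [1,  0, -1, 0, 1, 0, 1]], dtype=float)
W = np.array([[1,  0, 0, 1, -1, 0, 1],
              [0,  0, 1, 0,  1, 0, 0],
              [0,  1, 0, 1,  0, 0, 0],
              [1, -1, 1, 0,  0, 1, 0]], dtype=float)

# Sum of the seven rank-one tensors equals the 2x2 multiplication tensor.
T = matmul_tensor(2)
reconstruction = np.einsum('ar,br,cr->abc', U, V, W)
assert np.array_equal(reconstruction, T)
```

The assertion passing is exactly the statement that the 2×2 multiplication tensor has rank at most 7; the paper's experiments recover such a decomposition numerically rather than writing it down by hand.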

πŸ“ Abstract
Fast matrix multiplication can be described as a search for low-rank decompositions of the matrix multiplication tensor. We design a neural architecture, StrassenNet, which reproduces the Strassen algorithm for 2×2 multiplication. Across many independent runs the network always converges to a rank-7 tensor, thus numerically recovering Strassen's optimal algorithm. We then train the same architecture on 3×3 multiplication with rank r ∈ {19, …, 23}. Our experiments reveal a clear numerical threshold: models with r = 23 attain significantly lower validation error than those with r ≤ 22, suggesting that r = 23 could be the smallest effective rank of the 3×3 matrix multiplication tensor. We also sketch an extension of the method to border-rank decompositions via an ε-parametrisation and report preliminary results consistent with the known bounds for the border rank of the 3×3 matrix multiplication tensor.
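The search the abstract describes can be sketched, in a much-simplified form, as gradient descent on the factors of a rank-r CP decomposition of the multiplication tensor. The following NumPy sketch is illustrative only: it is not the StrassenNet architecture, and the tensor convention, initialization scale, and learning-rate choices are all assumptions.

```python
import numpy as np

def matmul_tensor(n):
    """T[a, b, c] = 1 iff A-entry a times B-entry b contributes to C-entry c
    (A, B, C flattened row-major; a convention assumed by this sketch)."""
    T = np.zeros((n * n, n * n, n * n))
    for i in range(n):
        for j in range(n):
            for k in range(n):
                T[i * n + k, k * n + j, i * n + j] = 1.0
    return T

def fit_cp(T, rank, lr=0.01, steps=4000, seed=0):
    """Plain gradient descent on the squared CP-fit error
    ||T - sum_r u_r (x) v_r (x) w_r||^2 over factor matrices U, V, W."""
    rng = np.random.default_rng(seed)
    d = T.shape[0]
    U, V, W = (0.3 * rng.standard_normal((d, rank)) for _ in range(3))
    losses = []
    for _ in range(steps):
        E = np.einsum('ar,br,cr->abc', U, V, W) - T  # residual tensor
        losses.append(float((E ** 2).sum()))
        # Gradients of the squared error with respect to each factor.
        gU = 2 * np.einsum('abc,br,cr->ar', E, V, W)
        gV = 2 * np.einsum('abc,ar,cr->br', E, U, W)
        gW = 2 * np.einsum('abc,ar,br->cr', E, U, V)
        U, V, W = U - lr * gU, V - lr * gV, W - lr * gW
    return U, V, W, losses

# Fit a rank-7 decomposition of the 2x2 multiplication tensor.
T = matmul_tensor(2)
U, V, W, losses = fit_cp(T, rank=7)
```

A decomposition found this way with near-zero residual at rank 7 would, after rounding, yield a Strassen-style algorithm; the paper's contribution is an architecture and training setup that makes this recovery reliable, which plain gradient descent as above does not guarantee.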
Problem

Research questions and friction points this paper is trying to address.

matrix multiplication
tensor rank
fast algorithms
Strassen algorithm
border rank
Innovation

Methods, ideas, or system contributions that make the work stand out.

StrassenNet
matrix multiplication tensor
low-rank decomposition
border rank
neural architecture
Paolo Andreini
Unknown affiliation
Alessandra Bernardi
Monica Bianchini
Barbara Toniella Corradini
Sara Marziali
Giacomo Nunziati
Franco Scarselli
University of Siena
Deep learning · Graph neural networks · Theory of neural networks · Computer vision by deep learning