🤖 AI Summary
This work addresses fundamental theoretical challenges in deep neural networks (DNNs), including expressivity, algorithmic learnability, computational complexity, generalization, and parameter identifiability, by proposing a unified mathematical framework grounded in low-rank tensor decomposition. We systematically introduce low-rank tensor modeling, long established in signal processing and machine learning, into DNN theory. Specifically, we establish an explicit mapping between network architecture and tensor factors, and we leverage polynomial-time tensor decomposition algorithms together with uniqueness theorems to derive rigorous characterizations of DNN learnability and generalization bounds. The resulting interdisciplinary framework not only uncovers the algebraic structure underlying deep learning but also provides a principled, mathematically grounded foundation for model design and analysis. This approach bridges abstract theoretical insights with practical algorithmic guarantees, advancing the interpretability and rigor of deep learning theory.
📝 Abstract
The groundbreaking performance of deep neural networks (NNs) has prompted a surge of interest in providing a mathematical basis for deep learning theory. Low-rank tensor decompositions are especially well suited for this task due to their close connection to NNs and their rich body of theoretical results. Several tensor decompositions enjoy strong uniqueness guarantees, which allow for a direct interpretation of their factors, and polynomial-time algorithms have been proposed to compute them. Through the connections between tensors and NNs, such results have supported many important advances in the theory of NNs. In this review, we show how low-rank tensor methods, long a core tool in the signal processing and machine learning communities, play a fundamental role in theoretically explaining different aspects of the performance of deep NNs, including their expressivity, algorithmic learnability and computational hardness, generalization, and identifiability. Our goal is to give an accessible overview of existing approaches (developed by different communities, ranging from computer science to mathematics) in a coherent and unified way, and to open a broader perspective on the use of low-rank tensor decompositions for the theory of deep NNs.
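To make the central object concrete, the following is a minimal, illustrative NumPy sketch (not taken from the paper) of the kind of low-rank structure discussed above: a rank-R CP (CANDECOMP/PARAFAC) decomposition, which writes a 3-way tensor as a sum of R rank-1 terms. The factor matrices A, B, C are the quantities whose uniqueness guarantees and polynomial-time recoverability underpin the interpretability and learnability results the review surveys.

```python
import numpy as np

# Illustrative sketch: a rank-R CP decomposition of a 3-way tensor,
# T[i, j, k] = sum_r A[i, r] * B[j, r] * C[k, r].
rng = np.random.default_rng(0)
I, J, K, R = 4, 5, 6, 3
A = rng.standard_normal((I, R))
B = rng.standard_normal((J, R))
C = rng.standard_normal((K, R))

# Assemble the tensor from its CP factors.
T = np.einsum('ir,jr,kr->ijk', A, B, C)

# A standard identity exploited by decomposition algorithms such as
# alternating least squares: the mode-1 unfolding of T equals
# A @ (C kr B).T, where "kr" is the Khatri-Rao (column-wise Kronecker)
# product of the remaining factor matrices.
khatri_rao = np.einsum('kr,jr->kjr', C, B).reshape(K * J, R)
T_unfolded = T.transpose(0, 2, 1).reshape(I, K * J)
print(np.allclose(T_unfolded, A @ khatri_rao.T))  # True
```

When the rank R is small relative to the tensor dimensions, classical results (e.g. Kruskal-type conditions) guarantee that A, B, and C are essentially unique, which is what allows the factors to be interpreted directly, for instance as the weights of an associated shallow network.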