Compressing Neural Networks Using Tensor Networks with Exponentially Fewer Variational Parameters

📅 2023-05-10
🏛️ Intelligent Computing
📈 Citations: 6
Influential: 0

🤖 AI Summary
To address excessive parameter counts, susceptibility to overfitting, and high hardware overhead in deep neural networks, this paper proposes a general model compression scheme based on deep automatically differentiable tensor networks (ADTN). Instead of storing weights as matrices or multi-way arrays, the scheme encodes the variational parameters of different layer types (linear, convolutional, etc.) in a deep, differentiable tensor network that contains exponentially fewer free parameters. On VGG-16, two linear layers with approximately $10^{7}$ parameters are compressed to two ADTNs with just 424 parameters, while CIFAR-10 test accuracy rises from 90.17% to 91.74%, a gain of 1.57 percentage points. The scheme is also demonstrated on FC-2, LeNet-5, AlexNet, and ZFNet, using MNIST, CIFAR-10, and CIFAR-100. Key contributions include: (i) a deep automatically differentiable tensor network for encoding NN parameters; (ii) extreme compression combined with improved test accuracy in the reported experiments; and (iii) a general scheme that applies regardless of layer type.
📝 Abstract
A neural network (NN) designed for challenging machine learning tasks is in general a highly nonlinear mapping that contains massive variational parameters. High complexity of an NN, if unbounded or unconstrained, might unpredictably cause severe issues including overfitting, loss of generalization power, and unbearable cost of hardware. In this work, we propose a general compression scheme that significantly reduces the variational parameters of NNs, regardless of their specific types (linear, convolutional, etc.), by encoding them to a deep automatically differentiable tensor network (ADTN) that contains exponentially fewer free parameters. Superior compression performance of our scheme is demonstrated on several widely recognized NNs (FC-2, LeNet-5, AlexNet, ZFNet and VGG-16) and datasets (MNIST, CIFAR-10 and CIFAR-100). For instance, we compress two linear layers in VGG-16 with approximately $10^{7}$ parameters to two ADTNs with just 424 parameters, improving the testing accuracy on CIFAR-10 from $90.17\%$ to $91.74\%$. We argue that the deep structure of ADTN is an essential reason for its remarkable compression performance, compared to existing compression schemes that are mainly based on tensor decompositions/factorizations and shallow tensor networks. Our work suggests deep TN as an exceptionally efficient mathematical structure for representing the variational parameters of NNs, which exhibits superior compressibility over the commonly used matrices and multi-way arrays.
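To make the idea of "exponentially fewer free parameters" concrete, here is a minimal NumPy sketch. It is not the authors' ADTN architecture: the brick-wall layout, the all-ones starting tensor, the near-identity initialization, and the names `make_gates` and `contract` are all assumptions chosen for illustration. It only shows how a few small 2×2×2×2 tensors, contracted layer by layer, can generate every entry of a much larger weight matrix.

```python
import numpy as np

def make_gates(q, n_layers, rng):
    """Hypothetical brick-wall layout: each layer holds 2x2x2x2 tensors on neighbouring index pairs."""
    layers = []
    for layer_idx in range(n_layers):
        start = layer_idx % 2  # alternate between even and odd pairs
        layer = []
        for site in range(start, q - 1, 2):
            # Near-identity initialization (an assumption, not taken from the paper).
            gate = np.eye(4).reshape(2, 2, 2, 2) + 0.1 * rng.standard_normal((2, 2, 2, 2))
            layer.append((site, gate))
        layers.append(layer)
    return layers

def contract(layers, q):
    """Contract all small tensors into one tensor with q binary indices (2**q entries)."""
    t = np.ones((2,) * q)  # fixed, non-trainable starting tensor
    for layer in layers:
        for site, gate in layer:
            # Sum over the gate's two input indices and the tensor's indices (site, site + 1).
            t = np.tensordot(gate, t, axes=([2, 3], [site, site + 1]))
            # tensordot puts the gate's output indices first; move them back into place.
            t = np.moveaxis(t, [0, 1], [site, site + 1])
    return t

q, n_layers = 20, 4
layers = make_gates(q, n_layers, np.random.default_rng(0))
n_params = sum(gate.size for layer in layers for _, gate in layer)

# Reshape the 2**20-entry tensor into a 1024 x 1024 weight matrix.
W = contract(layers, q).reshape(2 ** (q // 2), 2 ** (q // 2))
print(n_params, W.size)
```

In this toy setting, 608 trainable numbers generate all 1,048,576 entries of `W`. The number of small tensors grows linearly with `q` and with the depth, while the size of the generated matrix grows as `2**q`. In an automatic-differentiation framework the same contraction would be differentiable with respect to the small tensors, which is what allows such a network to be trained in place of the dense weights.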
Problem

Research questions and friction points this paper is trying to address.

Reducing neural network parameters to prevent overfitting and high costs
Compressing diverse NN types using deep tensor networks
Enhancing accuracy with exponentially fewer variational parameters
Innovation

Methods, ideas, or system contributions that make the work stand out.

Compress neural networks using tensor networks
Exponentially fewer variational parameters
Deep automatically differentiable tensor network
Authors
Yong Qing
Center for Quantum Physics and Intelligent Sciences, Department of Physics, Capital Normal University, Beijing 10048, China
P. Zhou
Center for Quantum Physics and Intelligent Sciences, Department of Physics, Capital Normal University, Beijing 10048, China
Ke Li
Center for Quantum Physics and Intelligent Sciences, Department of Physics, Capital Normal University, Beijing 10048, China
Shi-Ju Ran
Center for Quantum Physics and Intelligent Sciences, Department of Physics, Capital Normal University, Beijing 10048, China