🤖 AI Summary
The theoretical foundations of ternary (three-valued) ReLU neural networks remain underdeveloped, particularly regarding the growth rate of their number of linear regions—a key measure of expressive capacity.
Method: We conduct the first rigorous lower-bound analysis on the number of linear regions in ternary ReLU networks, integrating combinatorial geometry, activation pattern analysis, and piecewise-linear structure theory.
Contribution/Results: We prove that the number of linear regions grows exponentially with network depth and polynomially with width. Crucially, we show that squaring the width or doubling the depth suffices for a ternary network to achieve the same lower bound on expressive capacity as a full-precision ReLU network. This is the first theoretical demonstration that parameter-constrained ternary networks can match the complexity of standard networks through modest architectural scaling—providing a foundational theoretical guarantee for energy-efficient neural network design.
📝 Abstract
With the advancement of deep learning, reducing computational complexity and memory consumption has become a critical challenge, and ternary neural networks (NNs), which restrict parameters to $\{-1, 0, +1\}$, have attracted attention as a promising approach. While ternary NNs demonstrate excellent performance in practical applications such as image recognition and natural language processing, their theoretical understanding remains insufficient. In this paper, we theoretically analyze the expressivity of ternary NNs from the perspective of the number of linear regions. Specifically, we evaluate the number of linear regions of ternary regression NNs with Rectified Linear Unit (ReLU) activation functions and prove that the number of linear regions increases polynomially with respect to network width and exponentially with respect to depth, as in standard NNs. Moreover, we show that it suffices to either square the width or double the depth of a ternary NN to achieve a lower bound on the maximum number of linear regions comparable to that of a general ReLU regression NN. This provides a theoretical explanation, in some sense, for the practical success of ternary NNs.
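The notion of linear regions discussed above can be probed empirically: on any input where a ReLU network's activation pattern (the on/off state of every ReLU) changes, the network switches to a different affine map, so counting distinct activation patterns over a fine grid gives a lower bound on the number of linear regions. The sketch below illustrates this for a tiny ternary network; the architecture, the random seed, and the choice of real-valued biases are illustrative assumptions, not the construction used in the paper.

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def activation_pattern(x, weights, biases):
    """Return the tuple of ReLU on/off indicators at every layer for input x."""
    pattern = []
    h = x
    for W, b in zip(weights, biases):
        z = W @ h + b
        pattern.append(tuple(z > 0))  # which units are active in this layer
        h = relu(z)
    return tuple(pattern)

# Hypothetical tiny ternary network: 1 input, two hidden layers of width 3.
# Weights are restricted to {-1, 0, +1}; biases are left real-valued here
# purely for illustration.
rng = np.random.default_rng(0)
weights = [rng.choice([-1.0, 0.0, 1.0], size=(3, 1)),
           rng.choice([-1.0, 0.0, 1.0], size=(3, 3))]
biases = [rng.uniform(-1, 1, size=3), rng.uniform(-1, 1, size=3)]

# On a fine 1-D grid, each distinct activation pattern witnesses a distinct
# linear region, so this count is an empirical lower bound.
xs = np.linspace(-3, 3, 10001)
patterns = {activation_pattern(np.array([x]), weights, biases) for x in xs}
print(len(patterns))  # empirical lower bound on the number of linear regions
```

With 6 hidden units in total, the count can never exceed $2^6 = 64$ patterns; the theoretical results concern how fast the achievable maximum grows as width and depth increase.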