🤖 AI Summary
Problem: Conventional fixed activation functions have limited representational capacity. Method: This paper proposes a learnable nonlinear activation framework grounded in semiring algebra, generalizing the standard linear–nonlinear alternation paradigm in neural networks to a differentiable semiring convolutional framework. We design parameterized semiring activation operators and integrate them into fully connected layers and the ConvNeXt architecture. Moreover, we unify operations such as max-pooling and min-pooling as special cases of convolution in tropical semirings, thereby expanding the design space of neural operators. Contribution/Results: Experiments on standard image classification benchmarks demonstrate that our method enhances model representational diversity while preserving accuracy. It reveals a previously unobserved trade-off between convergence speed and generalization performance introduced by semiring-based activations, offering new insights into activation function design and optimization dynamics.
📝 Abstract
We introduce a class of trainable nonlinear operators based on semirings that are suitable for use in neural networks. These operators generalize the traditional alternation of linear operators with activation functions in neural networks. Semirings are algebraic structures that describe a generalised notion of linearity, greatly expanding the range of trainable operators that can be included in neural networks. In fact, max- and min-pooling operations are convolutions in the tropical semiring with a fixed kernel. We perform experiments in which we replace the activation functions with trainable semiring-based operators to show that these are viable operations to include in fully connected as well as convolutional neural networks (ConvNeXt). We discuss some of the challenges of replacing traditional activation functions with trainable semiring activations and the trade-offs of doing so.
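To make the claim about pooling concrete, here is a minimal sketch of a 1-D convolution in the max-plus (tropical) semiring, where semiring "addition" is `max` and semiring "multiplication" is ordinary `+`. The function name `tropical_conv1d` and the NumPy formulation are illustrative assumptions, not code from the paper; the paper's operators are additionally trainable, whereas this sketch uses a fixed kernel to show that max-pooling is the special case of an all-zero kernel.

```python
import numpy as np

def tropical_conv1d(x, kernel):
    """Valid-mode 1-D convolution in the max-plus (tropical) semiring.

    Ordinary convolution computes sum_j x[i+j] * k[j]; here the sum
    becomes max and the product becomes +:  out[i] = max_j (x[i+j] + k[j]).
    (Illustrative helper, not the paper's implementation.)
    """
    x, kernel = np.asarray(x, float), np.asarray(kernel, float)
    n, k = len(x), len(kernel)
    out = np.empty(n - k + 1)
    for i in range(n - k + 1):
        out[i] = np.max(x[i:i + k] + kernel)  # semiring sum of semiring products
    return out

# With an all-zero kernel, the tropical convolution reduces to max-pooling
# with window size len(kernel) and stride 1:
x = [1.0, 5.0, 2.0, 8.0, 3.0]
print(tropical_conv1d(x, np.zeros(3)))  # → [5. 8. 8.]
```

Making the kernel entries learnable parameters (and swapping `max` for `min` to obtain the min-plus semiring) yields the kind of trainable nonlinear operator the abstract describes.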