Thermodynamically Optimal Regularization under Information-Geometric Constraints

📅 2026-01-24
📈 Citations: 0
Influential citations: 0
🤖 AI Summary
This work addresses the lack of a unified theoretical foundation for existing regularization methods and their inability to guarantee thermodynamic energy efficiency during learning. Integrating information geometry, the principle of maximum entropy, and thermodynamic optimality, the study proposes that regularization should minimize the squared Fisher–Rao distance between the belief state and a reference state. Under explicit assumptions of parametrization invariance, maximum-entropy belief models, and quasi-static processes, the Fisher–Rao metric is shown to be the unique admissible geometry on belief space. Optimal regularization forms are derived for Gaussian and circular belief models, whose induced geometries are hyperbolic and von Mises manifolds, respectively. This establishes, for the first time, a theoretical framework linking regularization to thermodynamic efficiency, and yields experimentally testable predictions.
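The Gaussian-to-hyperbolic reduction mentioned in the summary admits a well-known closed form. The statement below follows from the standard identification of the univariate-Gaussian Fisher metric with a scaled hyperbolic upper half-plane; it is provided for orientation and is not quoted from the paper.

```latex
% Fisher information metric on univariate Gaussians N(mu, sigma^2):
%   ds^2 = (d\mu^2 + 2\,d\sigma^2)/\sigma^2,
% which equals 2(dx^2 + dy^2)/y^2 under (x, y) = (\mu/\sqrt{2}, \sigma),
% so geodesic distances are \sqrt{2} times hyperbolic half-plane distances:
d_{\mathrm{FR}}\bigl(\mathcal{N}(\mu_1,\sigma_1^2),\,\mathcal{N}(\mu_2,\sigma_2^2)\bigr)
  = \sqrt{2}\,\operatorname{arccosh}\!\left(
      1 + \frac{(\mu_1-\mu_2)^2/2 + (\sigma_1-\sigma_2)^2}{2\,\sigma_1\,\sigma_2}
    \right).
```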

📝 Abstract
Modern machine learning relies on a collection of empirically successful but theoretically heterogeneous regularization techniques, such as weight decay, dropout, and exponential moving averages. At the same time, the rapidly increasing energetic cost of training large models raises the question of whether learning algorithms approach any fundamental efficiency bound. In this work, we propose a unifying theoretical framework connecting thermodynamic optimality, information geometry, and regularization. Under three explicit assumptions -- (A1) that optimality requires an intrinsic, parametrization-invariant measure of information, (A2) that belief states are modeled by maximum-entropy distributions under known constraints, and (A3) that optimal processes are quasi-static -- we prove a conditional optimality theorem. Specifically, the Fisher--Rao metric is the unique admissible geometry on belief space, and thermodynamically optimal regularization corresponds to minimizing squared Fisher--Rao distance to a reference state. We derive the induced geometries for Gaussian and circular belief models, yielding hyperbolic and von Mises manifolds, respectively, and show that classical regularization schemes are structurally incapable of guaranteeing thermodynamic optimality. We introduce a notion of thermodynamic efficiency of learning and propose experimentally testable predictions. This work provides a principled geometric and thermodynamic foundation for regularization in machine learning.
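As a concrete illustration of "minimizing squared Fisher–Rao distance to a reference state", here is a minimal Python sketch for the univariate-Gaussian belief model, using the closed-form distance given after the AI summary. This is our illustration, not the paper's implementation; the function names, the reference state N(0, 1), and the penalty weight are all assumptions.

```python
import math

def fisher_rao_gaussian(mu1, sigma1, mu2, sigma2):
    """Closed-form Fisher-Rao distance between univariate Gaussians.

    Uses the identification of the Gaussian family (under the Fisher
    metric) with a scaled hyperbolic upper half-plane, so the geodesic
    distance follows from the half-plane distance formula.
    """
    num = (mu1 - mu2) ** 2 / 2.0 + (sigma1 - sigma2) ** 2
    return math.sqrt(2.0) * math.acosh(1.0 + num / (2.0 * sigma1 * sigma2))

def fr_regularizer(mu, sigma, mu_ref=0.0, sigma_ref=1.0, weight=1e-2):
    """Penalty term: weight * squared Fisher-Rao distance to a reference
    belief state -- the quantity the abstract says thermodynamically
    optimal regularization should minimize (names here are illustrative)."""
    return weight * fisher_rao_gaussian(mu, sigma, mu_ref, sigma_ref) ** 2

# Example: penalize the belief state N(0.5, 0.8^2) relative to N(0, 1).
print(fr_regularizer(0.5, 0.8))  # ~0.004
```

Note that, unlike an L2 weight-decay penalty on (mu, sigma), this distance is invariant under smooth reparametrization of the belief model, which is exactly the property assumption (A1) requires.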
Problem

Research questions and friction points this paper is trying to address.

regularization
thermodynamic optimality
information geometry
machine learning
Fisher–Rao metric
Innovation

Methods, ideas, or system contributions that make the work stand out.

thermodynamic optimality
information geometry
Fisher–Rao metric
regularization
maximum-entropy distributions