🤖 AI Summary
This study investigates a fundamental cause of failure in continual learning: even when a multi-task-compatible solution exists, the irreversibility inherent in finite-time learning can obstruct the acquisition of new tasks. By modeling learning as a nonequilibrium transport process in the space of parameter distributions, and by applying tools from nonequilibrium thermodynamics and information geometry such as entropy production and the Epistemic Speed Limit (ESL), the work introduces the concept of "critical period closure" to show how representational degrees of freedom are irreversibly lost under finite dissipation. This reframes catastrophic forgetting not as task interference but as a dynamical constraint, going beyond conventional continual-learning frameworks to pinpoint the intrinsic origin of constrained learning trajectories.
📝 Abstract
Learning performed over finite time is inherently irreversible. In Part I of this series, we modeled learning as a transport process in the space of parameter distributions and derived the Epistemic Speed Limit (ESL), which lower-bounds entropy production under finite-time dynamics. In this work (Part II), we show that irreversibility imposes a geometric restriction on future adaptability through the compositional structure of learning dynamics. Successive learning phases compose multiplicatively as transport maps, and their Jacobians form a semigroup whose rank and singular values are submultiplicative. As a result, dynamically usable degrees of reconfiguration can only decrease under composition. We formalize the remaining adaptability of a model in terms of compatible effective rank, defined as the log-volume of task-preserving directions that remain dynamically accessible. Although task performance may remain unchanged, finite-time learning can progressively reduce this reconfiguration capacity. We prove a capacity-threshold criterion for continual learning: let m_B denote the stable rank of the Hessian of a new task B restricted to the task-preserving manifold of a previously learned task A. If m_B exceeds the residual compatible effective rank, then task B is trajectory-level incompatible with task A; any sufficient adaptation necessarily induces forgetting. Thus catastrophic forgetting arises not from the absence of multi-task solutions, but from irreversible loss of reconfiguration capacity under compositional learning dynamics. This establishes a trajectory-level capacity limit for continual learning.
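The submultiplicativity claim at the heart of the abstract can be illustrated numerically. The sketch below is an illustrative assumption, not the paper's construction: it uses random low-rank matrices as stand-ins for the Jacobians of two learning phases, composes them by matrix product, and checks that the composed map's rank and largest singular value cannot exceed those of either factor. It also computes a stable-rank surrogate (||M||_F^2 / ||M||_2^2), the quantity the criterion applies to the restricted Hessian of task B.

```python
import numpy as np

rng = np.random.default_rng(0)

def stable_rank(M):
    # Stable rank ||M||_F^2 / ||M||_2^2: a smooth surrogate for matrix rank,
    # always bounded above by the exact rank.
    s = np.linalg.svd(M, compute_uv=False)
    return (s ** 2).sum() / s[0] ** 2

# Two low-rank "phase" Jacobians in a 10-dimensional parameter space
# (rank 6 and rank 4 by construction; names A, B, C are illustrative).
A = rng.normal(size=(10, 6)) @ rng.normal(size=(6, 10))
B = rng.normal(size=(10, 4)) @ rng.normal(size=(4, 10))
C = B @ A  # Jacobian of the composed transport map

rank = np.linalg.matrix_rank
# Rank is submultiplicative: composition can only lose directions.
assert rank(C) <= min(rank(A), rank(B))

# The largest singular value is submultiplicative as well.
sA, sB, sC = (np.linalg.svd(M, compute_uv=False)[0] for M in (A, B, C))
assert sC <= sA * sB + 1e-9

print("ranks:", rank(A), rank(B), rank(C), "stable rank of C:", stable_rank(C))
```

Once lost, the rank of the composition cannot be recovered by any later phase, which is the mechanism behind the monotone decrease of reconfiguration capacity described above.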