Mean-Field Limits for Two-Layer Neural Networks Trained with Consensus-Based Optimization

📅 2025-11-26
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work studies training two-layer neural networks with consensus-based optimization (CBO), a particle-based, gradient-free method. Methodologically, it (1) compares CBO against Adam on two test cases and proposes a hybrid CBO-Adam approach that converges faster than plain CBO; (2) recasts the CBO update into a formulation with lower memory overhead in the multi-task learning setting; and (3) reformulates CBO within the optimal transport framework. Theoretically, it couples the mean-field limit of CBO with the mean-field limit of the network: in the limit of infinitely many particles, it defines the corresponding dynamics on the Wasserstein-over-Wasserstein space and shows that the variance decreases monotonically.

📝 Abstract
We study two-layer neural networks and train them with a particle-based method called consensus-based optimization (CBO). We compare the performance of CBO against Adam on two test cases and demonstrate how a hybrid approach, combining CBO with Adam, provides faster convergence than CBO alone. In the context of multi-task learning, we recast CBO into a formulation that incurs less memory overhead. The CBO method allows for a mean-field limit formulation, which we couple with the mean-field limit of the neural network. To this end, we first reformulate CBO within the optimal transport framework. Finally, in the limit of infinitely many particles, we define the corresponding dynamics on the Wasserstein-over-Wasserstein space and show that the variance decreases monotonically.
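For orientation, the "particle-based method" in the abstract can be sketched as a single discretized CBO step: particles drift toward a Gibbs-weighted consensus point and diffuse with noise proportional to their distance from it. This is a minimal, generic Euler-Maruyama sketch of standard CBO, not the paper's implementation; all hyperparameter names, defaults, and the test objective are illustrative assumptions.

```python
import numpy as np

def cbo_step(X, f, lam=1.0, sigma=0.7, alpha=30.0, dt=0.01, rng=None):
    """One Euler-Maruyama step of consensus-based optimization (CBO).

    X: (N, d) array of N particles; f maps (N, d) -> (N,) objective values.
    Hyperparameter names and defaults follow common CBO conventions and
    are illustrative, not taken from the paper.
    """
    rng = np.random.default_rng() if rng is None else rng
    values = f(X)
    # Gibbs weights concentrate mass on the currently best particles;
    # subtracting the min keeps the exponentials numerically stable.
    w = np.exp(-alpha * (values - values.min()))
    consensus = (w[:, None] * X).sum(axis=0) / w.sum()
    drift = -lam * (X - consensus) * dt
    # Diffusion scaled by the distance to the consensus point, so the
    # noise vanishes as the swarm collapses onto a consensus.
    noise = sigma * (X - consensus) * np.sqrt(dt) * rng.standard_normal(X.shape)
    return X + drift + noise

# Minimize a shifted quadratic with minimizer at (1, 1).
rng = np.random.default_rng(0)
f = lambda Z: ((Z - 1.0) ** 2).sum(axis=1)
X = rng.standard_normal((200, 2)) * 3.0
for _ in range(500):
    X = cbo_step(X, f, rng=rng)
# The particle mean should now sit close to (1, 1).
```

In a hybrid scheme of the kind the abstract describes, such gradient-free CBO steps would be combined with gradient-based Adam updates; the precise interleaving used in the paper is not specified here.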
Problem

Research questions and friction points this paper is trying to address.

Training two-layer neural networks using consensus-based optimization methods
Developing hybrid approaches combining CBO with Adam for faster convergence
Establishing mean-field limits for neural networks within optimal transport framework
Innovation

Methods, ideas, or system contributions that make the work stand out.

Hybrid CBO-Adam approach for faster convergence
Memory-efficient CBO reformulation for multi-task learning
Mean-field CBO dynamics on Wasserstein-over-Wasserstein space
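The mean-field bullet refers to the infinite-particle limit of CBO. In the standard single-level setting, the particle system is known to converge to a nonlinear Fokker-Planck equation; the equation below shows that standard mean-field CBO PDE for orientation only, and is not the paper's coupled Wasserstein-over-Wasserstein system.

```latex
% Standard mean-field CBO dynamics for the law \rho_t of the particles
% (shown for orientation; the paper lifts such dynamics to the
% Wasserstein-over-Wasserstein space).
\partial_t \rho_t
  = \lambda \, \nabla \cdot \bigl( (x - m_\alpha[\rho_t]) \, \rho_t \bigr)
  + \frac{\sigma^2}{2} \, \Delta \bigl( |x - m_\alpha[\rho_t]|^2 \, \rho_t \bigr),
\qquad
m_\alpha[\rho] = \frac{\int x \, e^{-\alpha f(x)} \, \mathrm{d}\rho(x)}
                      {\int e^{-\alpha f(x)} \, \mathrm{d}\rho(x)}.
```

Here $m_\alpha[\rho]$ is the Gibbs-weighted consensus point; as $\alpha \to \infty$ it concentrates on the minimizers of $f$, and the monotone variance decrease stated in the abstract is the mean-field analogue of the swarm collapsing onto consensus.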