An Augmented Backward-Corrected Projector Splitting Integrator for Dynamical Low-Rank Training

📅 2025-02-05
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses the high computational cost and limited robustness of dynamical low-rank neural network training, which stem largely from frequent QR and singular value decompositions. We propose an augmented backward-corrected projector-splitting integrator. The key innovation is incorporating an augmentation step into the projector-splitting framework, which reduces the number of required QR decompositions while preserving convergence to a locally optimal solution, backed by a rigorous theoretical analysis. The method combines dynamical low-rank approximation, projector-splitting integration, backward correction, and numerical solution of matrix differential equations. Experiments on multiple benchmark tasks demonstrate stable training, fast convergence, and substantially reduced QR/SVD overhead, even though the decomposed matrices have small rank.

📝 Abstract
Layer factorization has emerged as a widely used technique for training memory-efficient neural networks. However, layer factorization methods face several challenges, particularly a lack of robustness during the training process. To overcome this limitation, dynamical low-rank training methods have been developed, utilizing robust time integration techniques for low-rank matrix differential equations. Although these approaches facilitate efficient training, they still depend on computationally intensive QR and singular value decompositions of matrices with small rank. In this work, we introduce a novel low-rank training method that reduces the number of required QR decompositions. Our approach integrates an augmentation step into a projector-splitting scheme, ensuring convergence to a locally optimal solution. We provide a rigorous theoretical analysis of the proposed method and demonstrate its effectiveness across multiple benchmarks.
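To make the abstract concrete, below is a minimal sketch of one step of a standard projector-splitting (KLS) integrator for a low-rank weight W ≈ U S Vᵀ, driven by a gradient-flow right-hand side. This illustrates the baseline scheme the paper builds on, not the proposed augmented backward-corrected variant; the `grad` oracle, step count, and learning rate are illustrative assumptions. Note the two QR decompositions per step, which are exactly the operations the paper aims to reduce.

```python
import numpy as np

def kls_step(U, S, V, grad, lr):
    """One Euler substep of the projector-splitting (KLS) integrator
    for the gradient flow W' = -grad(W) on the rank-r manifold.
    The paper's augmented variant modifies this scheme to need fewer QRs."""
    # K-step: evolve K = U S, then re-orthonormalize.
    K = U @ S - lr * grad(U @ S @ V.T) @ V
    U1, S_hat = np.linalg.qr(K)                 # QR decomposition #1
    # S-step: backward-in-time correction of the small core matrix.
    S_tilde = S_hat + lr * U1.T @ grad(U1 @ S_hat @ V.T) @ V
    # L-step: evolve L = V S^T, then re-orthonormalize.
    L = V @ S_tilde.T - lr * grad(U1 @ S_tilde @ V.T).T @ U1
    V1, S1T = np.linalg.qr(L)                   # QR decomposition #2
    return U1, S1T.T, V1

# Toy usage: fit rank-2 factors to an exactly rank-2 target matrix.
rng = np.random.default_rng(0)
m, n, r = 8, 6, 2
target = rng.standard_normal((m, r)) @ rng.standard_normal((r, n))
U, _ = np.linalg.qr(rng.standard_normal((m, r)))
V, _ = np.linalg.qr(rng.standard_normal((n, r)))
S = np.eye(r)
grad = lambda W: W - target        # gradient of 0.5 * ||W - target||^2
for _ in range(300):
    U, S, V = kls_step(U, S, V, grad, lr=0.1)
err = np.linalg.norm(U @ S @ V.T - target) / np.linalg.norm(target)
```

Each step costs two QR factorizations of tall skinny matrices; over a full training run this dominates the overhead the abstract refers to, which motivates the augmentation step that cuts the QR count.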
Problem

Research questions and friction points this paper is trying to address.

Lack of robustness in factorized low-rank training
Computational cost of frequent QR and SVD decompositions
Guaranteeing convergence to locally optimal solutions
Innovation

Methods, ideas, or system contributions that make the work stand out.

Augmented backward-corrected projector-splitting integrator
Augmentation step that reduces QR decompositions during training
Rigorous proof of convergence to a locally optimal solution