🤖 AI Summary
This work proposes a retraction-free second-order optimization algorithm on the Stiefel manifold that reduces the objective and the constraint infeasibility at the same time, addressing the limited efficiency of existing first-order retraction-free methods when high accuracy is required. The key innovation is the integration of the Newton–Schulz iteration into a second-order optimization framework, together with a geometric interpretation: Newton–Schulz moves along the normal space. By combining a modified Newton equation for the tangent component with Newton–Schulz orthogonalization for the normal component, the method achieves local quadratic convergence (superlinear for its inexact variant). Numerical experiments on the orthogonal Procrustes problem, principal component analysis, and real-data independent component analysis indicate that the proposed algorithm performs better than existing methods.
📝 Abstract
Retraction-free approaches offer attractive low-cost alternatives to Riemannian methods on the Stiefel manifold, but they are often first-order, which may limit their efficiency under high-accuracy requirements. To address this, we propose a second-order method that lands on the Stiefel manifold without invoking retractions and is proved to enjoy local quadratic convergence (superlinear for its inexact variant). The update is the sum of (i) a component tangent to the level set of the constraint-defining function, which aims to reduce the objective, and (ii) a component normal to the same level set, which reduces the infeasibility. Specifically, we construct the normal component via Newton–Schulz, a fixed-point iteration for orthogonalization. Moreover, we establish a geometric connection between the Newton–Schulz iteration and Stiefel manifolds, in which Newton–Schulz moves along the normal space. For the tangent component, we formulate a modified Newton equation that incorporates Newton–Schulz. Numerical experiments on the orthogonal Procrustes problem, principal component analysis, and real-data independent component analysis illustrate that the proposed method performs better than existing methods.
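The normal component described above can be illustrated with a small NumPy sketch. This is not the authors' implementation: it shows only the standard Newton–Schulz step X ← X + ½ X (I − XᵀX), and the function names, the test matrix, and the use of the Euclidean inner product are assumptions made for illustration. The step has the form X S with S symmetric, so it is orthogonal to every direction Z tangent to the level set of XᵀX (those with XᵀZ skew-symmetric), and repeating it drives the infeasibility ‖XᵀX − I‖ toward zero.

```python
import numpy as np

def newton_schulz_step(X):
    """One Newton-Schulz step; returns the new iterate and the symmetric factor S."""
    p = X.shape[1]
    S = 0.5 * (np.eye(p) - X.T @ X)  # symmetric p x p matrix
    return X + X @ S, S              # the step X @ S is a normal direction

def infeasibility(X):
    """Frobenius norm of X^T X - I (zero exactly on the Stiefel manifold)."""
    return np.linalg.norm(X.T @ X - np.eye(X.shape[1]))

# Illustrative data (assumption): a point slightly off the Stiefel manifold St(8, 3).
rng = np.random.default_rng(0)
Q, _ = np.linalg.qr(rng.standard_normal((8, 3)))
X = Q + 0.05 * rng.standard_normal((8, 3))

# Build a direction Z tangent to the level set {Y : Y^T Y = X^T X} at X,
# i.e. X^T Z is skew-symmetric, and check that the step is orthogonal to it.
B = rng.standard_normal((3, 3))
A = B - B.T                                  # skew-symmetric
Z = X @ np.linalg.solve(X.T @ X, A)          # X^T Z = A
_, S0 = newton_schulz_step(X)
inner = np.sum((X @ S0) * Z)                 # Euclidean inner product <X S0, Z>

# Iterate and record the infeasibility after each step.
errors = [infeasibility(X)]
for _ in range(5):
    X, _ = newton_schulz_step(X)
    errors.append(infeasibility(X))

print("inner product with tangent direction:", inner)
print("infeasibility per step:", errors)
```

The printed infeasibility values should shrink roughly quadratically from one step to the next, which is the behavior the abstract relies on when it uses Newton–Schulz to reduce infeasibility without a retraction.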