🤖 AI Summary
Existing linearly convergent algorithms for high-dimensional sparse convex optimization suffer from dense gradient computations per iteration, failing to exploit solution sparsity for practical efficiency.
Method: We propose a fully sparse first-order gradient method that, for the first time, achieves linear convergence while estimating gradients and performing updates solely on the support set of the current iterate. The method leverages ℓ₁-Lipschitz continuity of the gradient and ℓ₂-quadratic growth of the objective, yielding a convergence rate dependent on the mixed condition number β₁ₛ/α₂.
Contribution/Results: Each iteration incurs only O(s log d) computational complexity—where s is the sparsity level of the optimal solution and d the ambient dimension—substantially improving upon the standard O(d) cost. Experiments demonstrate 2–10× speedup over state-of-the-art methods in high-dimensional sparse settings, with minimal implementation overhead and no hyperparameter tuning required.
📝 Abstract
In was recently established that for convex optimization problems with a sparse optimal solution (may it be entry-wise sparsity or matrix rank-wise sparsity) it is possible to have linear convergence rates which depend on an improved mixed-norm condition number of the form $frac{β_1{}s}{α_2}$, where $β_1$ is the $ell_1$-Lipchitz continuity constant of the gradient, $α_2$ is the $ell_2$-quadratic growth constant, and $s$ is the sparsity of the optimal solution. However, beyond the improved convergence rate, these methods are unable to leverage the sparsity of optimal solutions towards improving also the runtime of each iteration, which may still be prohibitively high for high-dimensional problems. In this work, we establish that linear convergence rates which depend on this improved condition number can be obtained using only sparse updates, which may result in overall significantly improved running times. Moreover, our methods are considerably easier to implement.