A Randomized Algorithm for Sparse PCA based on the Basic SDP Relaxation

📅 2025-07-12

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Sparse Principal Component Analysis (SPCA) is NP-hard, and existing algorithms struggle to combine computational efficiency with theoretical guarantees. This paper proposes a randomized approximation algorithm based on the basic semidefinite programming (SDP) relaxation. With high probability, and if called enough times, the algorithm attains an approximation ratio of at most the sparsity constant. Under a technical assumption, which holds when the SDP solution is low-rank or has exponentially decaying eigenvalues, the average approximation ratio is also bounded by $\mathcal{O}(\log d)$, where $d$ is the number of features. In a covariance model that generalizes the spiked Wishart model, the algorithm achieves a near-optimal approximation ratio. Numerical results on real-world datasets support the algorithm's efficacy.

📝 Abstract

Sparse Principal Component Analysis (SPCA) is a fundamental technique for dimensionality reduction, and is NP-hard. In this paper, we introduce a randomized approximation algorithm for SPCA, which is based on the basic SDP relaxation. Our algorithm has an approximation ratio of at most the sparsity constant with high probability, if called enough times. Under a technical assumption, which is consistently satisfied in our numerical tests, the average approximation ratio is also bounded by $\mathcal{O}(\log d)$, where $d$ is the number of features. We show that this technical assumption is satisfied if the SDP solution is low-rank, or has exponentially decaying eigenvalues. We then present a broad class of instances for which this technical assumption holds. We also demonstrate that in a covariance model, which generalizes the spiked Wishart model, our proposed algorithm achieves a near-optimal approximation ratio. We demonstrate the efficacy of our algorithm through numerical results on real-world datasets.
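
For orientation, the SPCA problem and its basic SDP relaxation are commonly written as below. This is the standard formulation from the SPCA literature, not quoted from the paper, and the symbols are assumptions introduced here: $A$ is a $d \times d$ positive semidefinite input matrix, $k$ is the sparsity constant, and $\|W\|_1 = \sum_{i,j} |W_{ij}|$. The paper's own notation may differ.

$$
\begin{aligned}
\text{(SPCA)} \quad & \max_{x \in \mathbb{R}^d} \; x^\top A x && \text{s.t.} \quad \|x\|_2 = 1, \;\; \|x\|_0 \le k, \\
\text{(SDP)} \quad & \max_{W \in \mathbb{R}^{d \times d}} \; \operatorname{tr}(A W) && \text{s.t.} \quad \operatorname{tr}(W) = 1, \;\; \|W\|_1 \le k, \;\; W \succeq 0.
\end{aligned}
$$

The SDP is a relaxation because any feasible $x$ yields a feasible $W = x x^\top$ with the same objective value: $\operatorname{tr}(W) = \|x\|_2^2 = 1$, and by the Cauchy-Schwarz inequality $\|W\|_1 = \|x\|_1^2 \le k \|x\|_2^2 = k$.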

Problem

Research questions and friction points this paper is trying to address.

Develops randomized algorithm for NP-hard Sparse PCA

Provides approximation guarantees under technical conditions

Validates algorithm efficacy on real-world datasets

Innovation

Methods, ideas, or system contributions that make the work stand out.

Randomized algorithm for Sparse PCA

Based on basic SDP relaxation

Achieves near-optimal approximation ratio in a covariance model generalizing the spiked Wishart model
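
To make the idea of rounding an SDP solution concrete, here is a minimal Python sketch of a generic Gaussian rounding heuristic. It is not the paper's algorithm, whose details are not given on this page. The function name `round_sdp_solution`, its parameters, and the hard-thresholding step are all illustrative assumptions, and the matrix `W` in the usage lines is a stand-in rather than an actual SDP optimum.

```python
import numpy as np


def round_sdp_solution(A, W, k, num_samples=200, seed=0):
    """Hypothetical rounding sketch (NOT the paper's method).

    Draws z ~ N(0, W), keeps the k largest-magnitude entries, normalizes,
    and returns the best k-sparse unit vector found over num_samples draws.
    """
    rng = np.random.default_rng(seed)
    d = A.shape[0]

    # Factor W = L L^T through an eigendecomposition, which also works
    # when W is rank-deficient; tiny negative eigenvalues are clipped.
    eigvals, eigvecs = np.linalg.eigh(W)
    L = eigvecs * np.sqrt(np.clip(eigvals, 0.0, None))

    best_x, best_val = None, -np.inf
    for _ in range(num_samples):
        z = L @ rng.standard_normal(d)          # sample with covariance W
        support = np.argsort(np.abs(z))[-k:]    # indices of the k largest |z_i|
        x = np.zeros(d)
        x[support] = z[support]
        norm = np.linalg.norm(x)
        if norm == 0.0:
            continue                            # degenerate draw, skip it
        x /= norm                               # enforce ||x||_2 = 1
        val = float(x @ A @ x)                  # SPCA objective value
        if val > best_val:
            best_x, best_val = x, val
    return best_x, best_val


# Usage on synthetic data. W is a rank-one stand-in built from the top
# eigenvector of A (an assumption for illustration; it is not the SDP
# optimum and need not satisfy the SDP's l1 constraint).
data_rng = np.random.default_rng(42)
B = data_rng.standard_normal((50, 10))
A = B.T @ B / 50
top_vec = np.linalg.eigh(A)[1][:, -1]
W = np.outer(top_vec, top_vec)
x_hat, value = round_sdp_solution(A, W, k=3)
```

Repeating the draw and keeping the best candidate mirrors the abstract's remark that the guarantee holds "if called enough times"; in practice `W` would come from an SDP solver rather than from the stand-in used above.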
