🤖 AI Summary
In large-scale recommender systems, low-rank matrix factorization with missing entries is commonly fitted by Alternating Least Squares (ALS), which is computationally expensive because every iteration repeats regressions over the full data. This paper proposes an element-wise core-subset selection method: guided by theoretical analysis, it identifies the observed entries that contribute most to the parameter updates and runs the iterative optimization over this subset using sparse matrix operations. The method retains convergence guarantees and approximation accuracy while substantially reducing per-iteration computational cost. The paper derives theoretical error bounds and sufficient sampling conditions for accurate recovery. Experiments on multiple real-world and synthetic datasets show recommendation accuracy comparable to full-data ALS using only 10%–30% of its runtime. The result is an efficient approach to large-scale sparse matrix factorization that is backed by theoretical guarantees.
📝 Abstract
In this paper, we propose a novel element-wise subset selection method for the alternating least squares (ALS) algorithm, focusing on low-rank matrix factorization involving matrices with missing values, as commonly encountered in recommender systems. While ALS is widely used for providing personalized recommendations based on user-item interaction data, its high computational cost, stemming from repeated regression operations, poses significant challenges for large-scale datasets. To enhance the efficiency of ALS, we propose a core-elements subsampling method that selects a representative subset of data and leverages sparse matrix operations to approximate ALS estimations efficiently. We establish theoretical guarantees for the approximation and convergence of the proposed approach, showing that it achieves similar accuracy with significantly reduced computational time compared to full-data ALS. Extensive simulations and real-world applications demonstrate the effectiveness of our method in various scenarios, emphasizing its potential in large-scale recommendation systems.
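To make the idea concrete, the following is a minimal sketch of ALS on a partially observed matrix in which each regression uses only a chosen subset of the observed entries. It is not the authors' algorithm: the function names `masked_als` and `subsample_mask`, the ridge penalty `reg`, and in particular the selection rule (keep the observed entries with the largest absolute value) are assumptions introduced purely for illustration, and dense NumPy arrays stand in for the sparse matrix operations the abstract mentions.

```python
import numpy as np


def masked_als(R, mask, rank=2, reg=0.1, n_iters=20, seed=0):
    """Fit R ~ U @ V.T by ALS, using only entries where mask is True."""
    rng = np.random.default_rng(seed)
    n_users, n_items = R.shape
    U = rng.normal(scale=0.1, size=(n_users, rank))
    V = rng.normal(scale=0.1, size=(n_items, rank))
    ridge = reg * np.eye(rank)  # keeps every small system well-posed
    for _ in range(n_iters):
        # Fix V: each user row is a ridge regression on that user's kept items.
        for i in range(n_users):
            idx = mask[i]
            if idx.any():
                Vi = V[idx]
                U[i] = np.linalg.solve(Vi.T @ Vi + ridge, Vi.T @ R[i, idx])
        # Fix U: each item row is a ridge regression on that item's kept users.
        for j in range(n_items):
            idx = mask[:, j]
            if idx.any():
                Uj = U[idx]
                V[j] = np.linalg.solve(Uj.T @ Uj + ridge, Uj.T @ R[idx, j])
    return U, V


def subsample_mask(R, observed, keep_frac=0.3):
    """Hypothetical selection rule (an assumption, not the paper's criterion):
    keep roughly the keep_frac fraction of observed entries with largest |value|."""
    vals = np.abs(R[observed])
    if vals.size == 0:
        return observed.copy()
    k = max(1, int(keep_frac * vals.size))
    kth = vals.size - k
    thresh = np.partition(vals, kth)[kth]  # k-th largest absolute value
    return observed & (np.abs(R) >= thresh)


# Usage on a synthetic rank-2 matrix with about 60% of entries observed.
rng = np.random.default_rng(0)
A, B = rng.normal(size=(30, 2)), rng.normal(size=(20, 2))
R = A @ B.T
observed = rng.random(R.shape) < 0.6
core = subsample_mask(R, observed, keep_frac=0.3)
U_full, V_full = masked_als(R, observed)  # full-data ALS
U_core, V_core = masked_als(R, core)      # ALS restricted to the subset
```

The cost of each row update scales with the number of entries kept in that row, which is why restricting the regressions to a smaller subset reduces per-iteration work. How well the subset estimate tracks the full-data estimate depends on the selection rule, and that is the question the paper's theoretical guarantees address.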