Robust Rank Estimation for Noisy Matrices

📅 2025-10-22
🤖 AI Summary
Existing criteria for estimating the rank of noisy, outlier-contaminated matrices are either insufficiently robust or computationally expensive. This paper proposes the Divergence Information Criterion for Matrix Rank (DICMR), described as the first criterion to incorporate density power divergence (DPD) into rank selection. DICMR is first-order B-robust and stays computationally cheap: it has a closed form and requires no iterative optimization, data splitting, or resampling. The authors formulate a DPD-based model selection objective and derive asymptotic bounds on the probabilities of overestimating and underestimating the rank. In simulations on principal component analysis and matrix completion tasks, DICMR attains accuracy comparable to robust cross-validation at substantially lower computational cost. In a microarray data imputation application, it outperforms several state-of-the-art methods.

📝 Abstract
Estimating the true rank of a noisy data matrix is a fundamental problem underlying techniques such as principal component analysis and matrix completion. Existing rank estimation criteria, including information-based and cross-validation methods, are either highly sensitive to outliers or computationally demanding when combined with robust estimators. This paper proposes a new criterion, the Divergence Information Criterion for Matrix Rank (DICMR), that achieves both robustness and computational simplicity. Because it is derived from the density power divergence framework, DICMR inherits that framework's robustness properties while remaining simple to compute. We provide asymptotic bounds on its overestimation and underestimation probabilities, and demonstrate first-order B-robustness of the criterion. Extensive simulations show that DICMR delivers accuracy comparable to robustified cross-validation methods, but at far lower computational cost. We also present a real-data application to microarray imputation, where DICMR outperforms several state-of-the-art algorithms, to further demonstrate its practical utility.
Problem

Research questions and friction points this paper is trying to address.

Robustly estimating the true rank of noisy data matrices
Reducing the computational cost of robust rank estimation
Developing a robust alternative to outlier-sensitive rank criteria
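
The outlier sensitivity named above can be illustrated with a small simulation. The sketch below is not DICMR and is not taken from the paper: `naive_rank`, `make_example`, the threshold rule, and all parameter values are hypothetical choices made only for illustration. It builds a rank-3 matrix with Gaussian noise, counts singular values above a noise-level threshold, and then repeats the count after adding ten large outliers (0.2% of the entries).

```python
import numpy as np


def naive_rank(X, threshold):
    """Non-robust heuristic: count singular values above a fixed threshold."""
    s = np.linalg.svd(X, compute_uv=False)
    return int(np.sum(s > threshold))


def make_example(seed=0, n=100, p=50, rank=3, sigma=0.1,
                 n_outliers=10, outlier_size=20.0):
    """Return (clean, contaminated, threshold) for a toy low-rank problem.

    All names and values here are illustrative assumptions.
    """
    rng = np.random.default_rng(seed)
    # Low-rank signal plus i.i.d. Gaussian noise with standard deviation sigma.
    signal = rng.normal(size=(n, rank)) @ rng.normal(size=(rank, p))
    clean = signal + sigma * rng.normal(size=(n, p))

    # Contaminate a handful of entries, each in its own row and column.
    contaminated = clean.copy()
    k = np.arange(n_outliers)
    contaminated[k, k] += outlier_size

    # The largest singular value of pure noise is roughly
    # sigma * (sqrt(n) + sqrt(p)); the factor 1.5 leaves some margin.
    threshold = 1.5 * sigma * (np.sqrt(n) + np.sqrt(p))
    return clean, contaminated, threshold


if __name__ == "__main__":
    clean, contaminated, threshold = make_example()
    print("estimated rank, clean data:       ", naive_rank(clean, threshold))
    print("estimated rank, contaminated data:", naive_rank(contaminated, threshold))
```

On the clean matrix the count recovers the true rank of 3, while the few outliers push additional singular values above the threshold and inflate the estimate. This is the failure mode a robust criterion is meant to avoid.
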
Innovation

Methods, ideas, or system contributions that make the work stand out.

DICMR combines robustness with computational simplicity
Derived from the density power divergence, from which it inherits its robustness
Comes with asymptotic bounds on over- and underestimation probabilities and first-order B-robustness, at low computational cost
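
For context, the density power divergence that the criterion builds on has a standard definition, reproduced here as background. This is the general DPD between a data density $g$ and a model density $f$ with tuning parameter $\alpha > 0$, not the paper's specific DICMR formula, which this summary does not state.

```latex
d_\alpha(g, f) = \int \left\{ f^{1+\alpha}(x)
  - \left(1 + \frac{1}{\alpha}\right) g(x)\, f^{\alpha}(x)
  + \frac{1}{\alpha}\, g^{1+\alpha}(x) \right\} dx,
\qquad \alpha > 0 .
```

As $\alpha \to 0$ the divergence tends to the Kullback-Leibler divergence $\int g(x) \log\{g(x)/f(x)\}\, dx$, so $\alpha$ trades efficiency for robustness: larger values downweight observations that are unlikely under the model. B-robustness refers to the resulting influence function being bounded.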