Statistically and Computationally Optimal Estimation and Inference of Common Subspaces

📅 2026-06-04

🤖 AI Summary

This study addresses the estimation and statistical inference of a common subspace shared by multiple noisy symmetric low-rank matrices. The authors propose a projected gradient descent estimator initialized via a spectral sum-of-squares scheme and develop a corresponding inference procedure. They establish, for the first time, sharp optimality boundaries for both estimation and inference across different signal-to-noise ratio (SNR) regimes: under a strong estimation SNR, the estimator achieves the minimax optimal sinΘ error rate; under a strong inference SNR, they construct adaptive minimax optimal confidence intervals and prove that adaptive inference becomes information-theoretically impossible below this threshold.

📝 Abstract

Given multiple data matrices, many problems in statistics and data science rely on estimating a common subspace that captures certain structure shared by all the data matrices. In this paper we investigate the statistical and computational limits for the common subspace model in which one observes a collection of symmetric low-rank matrices perturbed by noise, where each low-rank matrix shares the same common subspace. Our main results identify several regimes of the signal-to-noise ratio (SNR) such that estimation and inference are statistically or computationally optimal, and we refer to these regimes as weak SNR, moderate SNR, strong estimation SNR, and strong inference SNR. First, we propose an estimator based on projected gradient descent initialized via spectral sum of squares and show that it achieves the optimal $\sin\Theta$ error rate under strong estimation SNR. These results are complemented by both statistical and computational lower bounds identifying the weak and moderate estimation SNR regimes. Next, we turn to statistical inference for the $\sin\Theta$ distance itself, and we show that our estimator has an asymptotically Gaussian distribution in the strong inference SNR regime. Based on this limiting result we propose confidence intervals and show that they are adaptively minimax optimal in the strong inference SNR regime, where adaptivity is measured in terms of the SNR. Finally, we show that adaptive confidence intervals are information-theoretically impossible below the strong inference SNR regime. Consequently, our results unveil a novel phenomenon: despite the SNR being "above" the computational limit for estimation, adaptive statistical inference may still be information-theoretically impossible.
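To make the setup concrete, here is a minimal, dependency-free sketch of a spectral sum-of-squares initialization in the rank-one case, together with the $\sin\Theta$ distance to the true direction. It is an illustration under assumptions, not the paper's estimator: the projected gradient descent refinement is omitted, and the function names (`simulate`, `sum_of_squares_init`, `sin_theta`), the Gaussian noise, the random signal signs, and all parameter values are hypothetical choices made for this example. One plausible reason to square before summing is that signal eigenvalues of opposite sign would cancel in a plain sum but not in a sum of squares.

```python
import math
import random


def matvec(A, v):
    """Multiply a dense matrix (list of rows) by a vector."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]


def normalize(v):
    """Scale a vector to unit Euclidean norm."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def sum_of_squares_init(mats, iters=50):
    """Approximate the leading eigenvector of sum_l A_l^2 by power iteration.

    Rank-one illustration only; A_l^2 v is computed as A_l (A_l v).
    """
    n = len(mats[0])
    v = normalize([1.0 + 0.01 * i for i in range(n)])  # arbitrary fixed start
    for _ in range(iters):
        w = [0.0] * n
        for A in mats:
            AAv = matvec(A, matvec(A, v))
            w = [wi + x for wi, x in zip(w, AAv)]
        v = normalize(w)
    return v


def sin_theta(u, v):
    """sin(Theta) distance between the lines spanned by unit vectors u and v."""
    c = sum(a * b for a, b in zip(u, v))
    return math.sqrt(max(0.0, 1.0 - c * c))


def simulate(n=30, num_mats=5, signal=10.0, sigma=0.1, seed=0):
    """Draw A_l = lam_l * u u^T + E_l with symmetric Gaussian noise E_l.

    The sign of lam_l is random, so the plain sum of the A_l may partly cancel.
    """
    rng = random.Random(seed)
    u = normalize([rng.gauss(0, 1) for _ in range(n)])
    mats = []
    for _ in range(num_mats):
        lam = signal * rng.choice([-1.0, 1.0])
        A = [[lam * u[i] * u[j] for j in range(n)] for i in range(n)]
        for i in range(n):
            for j in range(i, n):
                e = rng.gauss(0, sigma)
                A[i][j] += e
                if j != i:
                    A[j][i] += e  # keep the matrix symmetric
        mats.append(A)
    return u, mats


if __name__ == "__main__":
    u, mats = simulate()
    u_hat = sum_of_squares_init(mats)
    print("sin-theta error of the initializer:", sin_theta(u, u_hat))
```

In this toy setting the signal is far larger than the noise, so the initializer alone should already land close to the true direction. The regimes discussed in the abstract concern what happens as that gap shrinks, which this sketch does not attempt to reproduce.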

Problem

Research questions and friction points this paper is trying to address.

common subspace

low-rank matrices

signal-to-noise ratio

statistical inference

estimation

Innovation

Methods, ideas, or system contributions that make the work stand out.

common subspace estimation

statistical optimality

computational limits