🤖 AI Summary
This paper investigates the robustness of the Lasso estimator for high-dimensional sparse linear regression under ill-conditioned designs: when the non-support columns of the design matrix $X$ are arbitrarily correlated among themselves, conventional restricted eigenvalue (RE) conditions can fail, leaving the prediction error uncontrolled. To address this, we propose a semi-random family of designs, called partially-rotated designs, and introduce a new deterministic condition, restricted normalized orthogonality (RNO), under which the RE constant depends only on the smallest eigenvalue of the normalized Gram matrix of the true support columns, eliminating any interference from correlations among irrelevant features. The analysis combines random matrix theory, sparse optimization, and a characterization of RE constants: we construct a random rotation of the support columns and prove that the resulting design satisfies RNO with high probability. Under this framework, the Lasso achieves prediction error $O(k \log d / (\lambda_{\min} n))$, matching the rate for fully well-conditioned designs while remaining robust to arbitrary correlations among the non-support columns.
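As a concrete illustration of the construction above, here is a minimal NumPy sketch (the function name `partially_rotated_design` and its interface are illustrative, not from the paper): it left-multiplies the support columns of an arbitrary design by a Haar-random orthogonal matrix while leaving the remaining, possibly badly correlated, columns untouched.

```python
import numpy as np

def partially_rotated_design(X, support, rng):
    """Apply a Haar-random orthogonal matrix Q to the support columns of X,
    leaving the (possibly arbitrarily correlated) other columns fixed.
    Illustrative sketch; names and interface are not from the paper."""
    n = X.shape[0]
    # QR of a Gaussian matrix plus a sign fix yields a Haar-distributed Q.
    Q, R = np.linalg.qr(rng.standard_normal((n, n)))
    Q = Q * np.sign(np.diag(R))
    X_rot = X.copy()
    X_rot[:, support] = Q @ X[:, support]
    return X_rot
```

Since $(Q X_S)^\top (Q X_S) = X_S^\top X_S$, the rotation changes only how the support columns sit relative to the rest of the design; the quantity $\lambda_{\min}$ appearing in the prediction error bound is unchanged.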
📝 Abstract
We consider the sparse linear regression model $\mathbf{y} = X\beta + \mathbf{w}$, where $X \in \mathbb{R}^{n \times d}$ is the design, $\beta \in \mathbb{R}^{d}$ is a $k$-sparse secret, and $\mathbf{w} \sim N(0, I_n)$ is the noise. Given input $X$ and $\mathbf{y}$, the goal is to estimate $\beta$. In this setting, the Lasso estimate achieves prediction error $O(k \log d / (\gamma n))$, where $\gamma$ is the restricted eigenvalue (RE) constant of $X$ with respect to $\mathrm{support}(\beta)$. In this paper, we introduce a new \textit{semirandom} family of designs, which we call \textit{partially-rotated} designs, for which the RE constant with respect to the secret is bounded away from zero even when a subset of the design columns are arbitrarily correlated among themselves. As an example of such a design, suppose we start with some arbitrary $X$ and then apply a random rotation to the columns of $X$ indexed by $\mathrm{support}(\beta)$. Let $\lambda_{\min}$ be the smallest eigenvalue of $\frac{1}{n} X_{\mathrm{support}(\beta)}^\top X_{\mathrm{support}(\beta)}$, where $X_{\mathrm{support}(\beta)}$ is the restriction of $X$ to the columns indexed by $\mathrm{support}(\beta)$. In this setting, our results imply that Lasso achieves prediction error $O(k \log d / (\lambda_{\min} n))$ with high probability. This prediction error bound is independent of the arbitrary columns of $X$ not indexed by $\mathrm{support}(\beta)$, and is as good as if all of these columns were perfectly well-conditioned. Technically, our proof reduces to showing that matrices with a certain deterministic property, which we call \textit{restricted normalized orthogonality} (RNO), lead to RE constants that are independent of a subset of the matrix columns. This property is similar to but incomparable with the restricted orthogonality condition of [CT05].
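The following end-to-end check is an illustrative sketch, not code from the paper: the non-support columns are deliberately made near-duplicates of a single direction (so RE conditions on the full design fail), the support columns are rotated as in the model, and the Lasso prediction error is compared against the $k \log d / (\lambda_{\min} n)$ rate. All parameter choices ($n$, $d$, $k$, and the regularization level `alpha`) are ours.

```python
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)
n, d, k = 200, 500, 5
support = np.arange(k)

# Non-support columns: near-duplicates of one direction, i.e. arbitrarily
# correlated among themselves; classical RE on the full design fails here.
X = rng.standard_normal((n, 1)) + 0.01 * rng.standard_normal((n, d))

# Support columns: well-spread, then hit with a Haar-random rotation.
X[:, support] = rng.standard_normal((n, k))
Q, R = np.linalg.qr(rng.standard_normal((n, n)))
X[:, support] = (Q * np.sign(np.diag(R))) @ X[:, support]

beta = np.zeros(d)
beta[support] = 1.0
y = X @ beta + rng.standard_normal(n)  # w ~ N(0, I_n)

# Smallest eigenvalue of (1/n) X_S^T X_S (unchanged by the rotation).
lam_min = np.linalg.eigvalsh(X[:, support].T @ X[:, support] / n)[0]

# sklearn's Lasso minimizes (1/(2n))||y - Xb||^2 + alpha ||b||_1;
# alpha ~ sqrt(log d / n) is the usual theoretical tuning.
fit = Lasso(alpha=np.sqrt(2 * np.log(d) / n), fit_intercept=False).fit(X, y)

pred_err = np.sum((X @ (fit.coef_ - beta)) ** 2) / n
print(f"(1/n)||X(beta_hat - beta)||^2 : {pred_err:.4f}")
print(f"k log d / (lam_min * n)       : {k * np.log(d) / (lam_min * n):.4f}")
```

If the theory applies as stated, the measured error should sit within a constant factor of the printed rate, even though the full design's Gram matrix is nearly rank one off the support.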