🤖 AI Summary
This paper addresses the fundamental problem of efficiently generating approximate samples from a target distribution $Q_\theta$ given a single Gaussian observation with unknown mean $\theta$, and leverages this to establish computational complexity lower bounds for high-dimensional statistical models. The authors propose a generic reduction framework based on Gaussian sources, enabling the first computationally efficient reductions to non-Gaussian location models (e.g., generalized normal and Student's $t$-distributions) and to nonlinear $k$-sparse generalized linear models (including phase retrieval). Key contributions include: (i) a rigorous characterization of the statistical-computational gap in $k$-sparse phase retrieval, resolving the conjecture of Cai et al. (2016) by establishing the tight transition from $k$ to $k^2$; (ii) constructing reductions that connect symmetric mixtures of linear regressions to generalized linear models, and the sparse rank-1 submatrix model to the planted submatrix model; and (iii) demonstrating the universality of Gaussian computational lower bounds and deriving sharp phase-transition thresholds across multiple model classes.
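For context, the conjectured gap in $k$-sparse phase retrieval is usually stated in terms of sample size; the following is our paraphrase of the standard formulation (exact scalings and log factors are as given in the paper, where $n$ is the sample size and $p$ the ambient dimension):

$$
n \asymp k \log p \ \ \text{(statistically sufficient)} \qquad \text{vs.} \qquad n \gtrsim k^2 \ \ \text{(up to log factors, conjectured necessary for polynomial-time algorithms)}.
$$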
📝 Abstract
Given a single observation from a Gaussian distribution with unknown mean $\theta$, we design computationally efficient procedures that can approximately generate an observation from a different target distribution $Q_{\theta}$, uniformly for all $\theta$ in a parameter set. We leverage our technique to establish reduction-based computational lower bounds for several canonical high-dimensional statistical models under widely believed conjectures in average-case complexity. In particular, we cover cases in which:

1. $Q_{\theta}$ is a general location model with a non-Gaussian distribution, including both light-tailed examples (e.g., generalized normal distributions) and heavy-tailed ones (e.g., Student's $t$-distributions). As a consequence, we show that computational lower bounds proved for spiked tensor PCA with Gaussian noise are universal, in that they extend to other non-Gaussian noise distributions within our class.

2. $Q_{\theta}$ is a normal distribution with mean $f(\theta)$ for a general, smooth, and nonlinear link function $f:\mathbb{R} \rightarrow \mathbb{R}$. Using this technique, we construct a reduction from symmetric mixtures of linear regressions to generalized linear models with link function $f$, and establish computational lower bounds for solving the $k$-sparse generalized linear model when $f$ is an even function. This result constitutes the first reduction-based confirmation of a $k$-to-$k^2$ statistical-to-computational gap in $k$-sparse phase retrieval, resolving a conjecture posed by Cai et al. (2016). As a second application, we construct a reduction from the sparse rank-1 submatrix model to the planted submatrix model, establishing a pointwise correspondence between the phase diagrams of the two models that faithfully preserves regions of computational hardness and tractability.
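To make the object of study concrete, here is a minimal sketch of the one easy special case of such a sampling reduction: when the target $Q_\theta$ is itself Gaussian with a larger variance, a single source observation $X \sim N(\theta, 1)$ can be converted exactly by adding independent noise, without ever learning $\theta$. This is our own illustration, not the paper's construction; the paper's framework handles non-Gaussian targets and nonlinear means $f(\theta)$, which this trivial case does not.

```python
import numpy as np

def gaussian_to_gaussian(x, sigma_target, rng):
    """Toy sampling reduction: turn a draw X ~ N(theta, 1) with unknown theta
    into an exact draw Y ~ N(theta, sigma_target**2), for any sigma_target >= 1.

    Illustration only: the paper's reductions cover non-Gaussian targets
    Q_theta and nonlinear means f(theta), which this special case does not.
    """
    if sigma_target < 1.0:
        raise ValueError("cannot shrink the variance without knowing theta")
    # Independent N(0, sigma_target^2 - 1) noise preserves the unknown mean
    # theta and inflates the variance to exactly sigma_target^2.
    return x + rng.normal(0.0, np.sqrt(sigma_target**2 - 1.0), size=np.shape(x))

# Sanity check: empirical mean and variance of the reduced samples.
rng = np.random.default_rng(0)
theta = 2.5                                # used only to simulate the source
xs = rng.normal(theta, 1.0, size=100_000)  # Gaussian "source" observations
ys = gaussian_to_gaussian(xs, sigma_target=3.0, rng=rng)
print(round(ys.mean(), 2), round(ys.var(), 2))  # ~2.5 and ~9.0
```

The difficulty the paper addresses lies entirely beyond this case: matching a non-Gaussian target shape, or a nonlinear mean $f(\theta)$, uniformly over all $\theta$ from a single observation, which is what makes the resulting lower-bound reductions nontrivial.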