🤖 AI Summary
This work investigates the computational feasibility of invariant learning in multi-environment settings, showing that even when an invariant structure is statistically identifiable, efficiently eliminating spurious correlations can remain intractable. Combining average-case complexity with the invariant risk minimization framework, the authors construct a samplable family of multi-environment instances and show that, conditional on an average-case sparse-recovery hardness primitive, these instances are learnable from polynomially many samples by exhaustive search yet admit no polynomial-time constant-accuracy recovery algorithm. They introduce an environment diversity parameter γ that characterizes identifiability and the curvature of invariance objectives. Under sufficient diversity and local Gaussian regularity, they derive a minimax risk of Θ(k(d−k)/(n|ℰ|)), and under label-induced shifts they identify a phase transition at a sample-size threshold n* ∝ k(d−k)/(|ℰ|γ²), governed by the interplay among sample size, number of environments, and ambient dimension. Experiments on synthetic and real datasets illustrate these theoretical predictions.
📝 Abstract
Invariant learning can fail even when the invariant structure is statistically identifiable. We show a conditional computational barrier: under a black-box samplable supervised sparse recovery primitive motivated by average-case sparse-recovery reductions, there exist \emph{samplable} multi-environment instances with a one-dimensional predictive invariant subspace ($k=1$) that are learnable with polynomially many samples by exhaustive search, while any polynomial-time constant-accuracy recovery algorithm would contradict the primitive. We further quantify environment diversity by a separation parameter $\gamma$, which controls identifiability and the curvature of invariance objectives. Under sufficient diversity and local Gaussian regularity, the minimax risk is $\mathbb{E}[\operatorname{dist}(\hat{V},V_{\mathrm{inv}})^2]=\Theta(k(d-k)/(n|\mathcal{E}|))$, and under label-induced shifts a phase transition occurs at $n^*\propto k(d-k)/(|\mathcal{E}|\gamma^2)$, with a refined estimation error that scales as $1/\gamma^2$. Synthetic and real datasets illustrate the predicted gaps and transitions and motivate simple diversity diagnostics.
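To make the two scaling laws concrete, the following minimal sketch evaluates $k(d-k)/(n|\mathcal{E}|)$ and $k(d-k)/(|\mathcal{E}|\gamma^2)$ numerically. The abstract gives only orders of magnitude, so the leading constant `c` is an assumption (set to 1 here), and the function names and example values are hypothetical and not taken from the paper.

```python
def minimax_rate(k: int, d: int, n: int, num_envs: int, c: float = 1.0) -> float:
    """Order-level minimax risk c * k(d-k) / (n * |E|).

    c stands in for the unspecified constant hidden by Theta(.); it is assumed to be 1.
    """
    return c * k * (d - k) / (n * num_envs)


def threshold_samples(k: int, d: int, num_envs: int, gamma: float, c: float = 1.0) -> float:
    """Order-level per-environment sample threshold c * k(d-k) / (|E| * gamma^2).

    c stands in for the unspecified proportionality constant; it is assumed to be 1.
    """
    return c * k * (d - k) / (num_envs * gamma ** 2)


if __name__ == "__main__":
    # Hypothetical setting: one invariant direction in 101 ambient dimensions, 5 environments.
    k, d, num_envs = 1, 101, 5
    print("rate at n=1000:", minimax_rate(k, d, n=1000, num_envs=num_envs))
    for gamma in (0.5, 0.25):
        print(f"gamma={gamma}: n* ~", threshold_samples(k, d, num_envs, gamma))
```

Under these assumptions, halving $\gamma$ quadruples the threshold, while doubling the number of environments halves both the threshold and the rate.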