Univariate-Guided Sparse Regression

πŸ“… 2025-01-30
πŸ“ˆ Citations: 3
✨ Influential: 1
πŸ“„ PDF
πŸ€– AI Summary
To address the poor stability, weak interpretability, and stringent theoretical assumptions (e.g., the incoherence or irrepresentability condition) inherent in the Lasso for sparse regression, this paper proposes UniLassoβ€”a two-stage sparse regression method. Its core innovation lies in the first joint incorporation of sign consistency and magnitude information from univariate regression coefficients into modeling: Stage I retains the signs of univariate estimates, and Stage II optimizes a sparse solution under these sign constraints. This design circumvents reliance on the irrepresentability condition, substantially improving support recovery accuracy and consistency of prediction error. Theoretical analysis establishes statistical consistency under high-dimensional asymptotics, and the framework naturally extends to generalized linear models and the Cox proportional hazards model. Extensive simulations and real-data experiments demonstrate that UniLasso systematically outperforms standard Lasso in both sparsity identification accuracy and model interpretability.

Technology Category

Application Category

πŸ“ Abstract
In this paper, we introduce ``UniLasso'' -- a novel statistical method for sparse regression. This two-stage approach preserves the signs of the univariate coefficients and leverages their magnitude. Both of these properties are attractive for stability and interpretation of the model. Through comprehensive simulations and applications to real-world datasets, we demonstrate that UniLasso outperforms Lasso in various settings, particularly in terms of sparsity and model interpretability. We prove asymptotic support recovery and mean-squared error consistency under a set of conditions different from the well-known irrepresentability conditions for the Lasso. Extensions to generalized linear models (GLMs) and Cox regression are also discussed.
Problem

Research questions and friction points this paper is trying to address.

Develops UniLasso for sparse regression with sign preservation
Enhances model stability and interpretability via univariate coefficients
Proves asymptotic support recovery under unique conditions
Innovation

Methods, ideas, or system contributions that make the work stand out.

Two-stage sparse regression method
Preserves univariate coefficient signs
Extends to GLMs and Cox regression
πŸ”Ž Similar Papers
No similar papers found.
S
Sourav Chatterjee
Department of Mathematics, Stanford University
Trevor Hastie
Trevor Hastie
Professor of Statistics, Stanford University
Statistical learning and modelingdata miningmachine learning
R
R. Tibshirani
Department of Statistics, Stanford University