Principal Components Decomposition of Fraction of Variance Explained in High Dimensional Linear Models with Strong Correlation

📅 2026-06-02
📈 Citations: 0
Influential: 0

🤖 AI Summary
This study addresses the bias of existing fraction of variance explained (FVE) estimators, such as GWASH and LMM-REML, when predictors in a high-dimensional linear model are strongly correlated. The authors propose a two-component estimation framework based on principal component decomposition: the FVE is partitioned into a low-dimensional component that captures the strong correlation, estimated with low-dimensional methods, and a high-dimensional component with the remaining weak correlation, estimated with high-dimensional methods such as GWASH or LMM-REML. In simulations, separating out the dominant principal components before applying GWASH or LMM-REML reduces bias relative to applying those estimators directly, and the method performs consistently as both the number of predictors and the number of samples increase. An analysis of the Adolescent Brain Cognitive Development (ABCD) brain imaging dataset illustrates the method on the FVE of cognitive measures predicted by high-resolution brain imaging data.
📝 Abstract
The fraction of variance explained (FVE) in a linear model quantifies the extent to which predictors account for outcome variability. In high-dimensional settings, where traditional FVE estimators do not apply, modern FVE estimators such as GWASH or the linear mixed-effects model estimated by restricted maximum likelihood (LMM-REML) struggle with strong correlation among predictors, which is often found, for example, in brain imaging data. We propose a decomposition framework that partitions the FVE into two components: a low-dimensional component capturing the strong correlation, estimable by low-dimensional methods, and a high-dimensional component with the remaining weak correlation, estimable by high-dimensional methods. Simulations demonstrate that separating out the dominant principal components (PCs) and estimating the high-dimensional FVE using GWASH or LMM-REML reduces bias compared with applying GWASH or LMM-REML directly. Our method shows consistent performance asymptotically as both the number of predictors and the number of samples increase. We illustrate the method in an analysis of the Adolescent Brain Cognitive Development (ABCD) brain imaging dataset, capturing nuanced heritability signals in the FVE of cognitive measures predicted by high-resolution brain imaging data.
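The two-component split described in the abstract can be illustrated with a small simulation. This is a minimal sketch and not the authors' implementation. It assumes that the split is made by projecting the centered predictors onto their top-k principal components, that adjusted R² stands in for the "low-dimensional method", and that one latent factor generates the strong correlation. The names `pc_split` and `adjusted_r2` and the simulation settings are hypothetical. The high-dimensional step (GWASH or LMM-REML on the remainder) is deliberately left out. The code only computes the oracle value that such an estimator would target.

```python
import numpy as np

def pc_split(X, k):
    """Split centered X into its top-k PC part and the orthogonal remainder."""
    Xc = X - X.mean(axis=0)
    # Right singular vectors of the centered data are the PC loadings.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    Vk = Vt[:k].T               # (p, k) loadings of the dominant PCs
    Z = Xc @ Vk                 # (n, k) low-dimensional PC scores
    X_rest = Xc - Z @ Vk.T      # (n, p) predictors with the top PCs removed
    return Z, X_rest, Vk

def adjusted_r2(Z, y):
    """Adjusted R^2 of an OLS fit of centered y on the (already centered) Z."""
    n, k = Z.shape
    yc = y - y.mean()
    coef, *_ = np.linalg.lstsq(Z, yc, rcond=None)
    rss = np.sum((yc - Z @ coef) ** 2)
    tss = np.sum(yc ** 2)
    return 1.0 - (rss / (n - k - 1)) / (tss / (n - 1))

# Hypothetical simulation: one shared latent factor makes all predictors
# strongly correlated; the rest of the variation is independent noise.
rng = np.random.default_rng(0)
n, p, k = 500, 200, 1
factor = rng.standard_normal((n, 1))
X = 2.0 * factor + rng.standard_normal((n, p))
beta = rng.standard_normal(p) / np.sqrt(p)

Z, X_rest, Vk = pc_split(X, k)
signal_low = Z @ (Vk.T @ beta)   # part of the signal carried by the top PCs
signal_high = X_rest @ beta      # part carried by the weakly correlated rest
y = signal_low + signal_high + rng.standard_normal(n)

# Oracle (true-beta) FVE components. They add up because the PC scores
# and the remainder are orthogonal in the sample.
var_y = np.var(y)
fve_low = np.var(signal_low) / var_y
fve_high = np.var(signal_high) / var_y
fve_total = np.var(signal_low + signal_high) / var_y

# Low-dimensional component estimated from data alone.
fve_low_hat = adjusted_r2(Z, y)

print(f"oracle FVE: low={fve_low:.3f} high={fve_high:.3f} total={fve_total:.3f}")
print(f"estimated low-dimensional FVE: {fve_low_hat:.3f}")
```

In a full analysis, the remaining high-dimensional component would be estimated by passing `X_rest` and the residual of `y` after the low-dimensional fit to a high-dimensional estimator, and the two estimates would be added.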
Problem

Research questions and friction points this paper is trying to address.

fraction of variance explained
high-dimensional linear models
strong correlation
predictor correlation
FVE estimation
Innovation

Methods, ideas, or system contributions that make the work stand out.

Fraction of Variance Explained
Principal Components Decomposition
High-Dimensional Linear Models
Strong Correlation
Heritability Estimation
Man Luo
University of Science and Technology of China
Chun Chieh Fan
Center for Population Neuroscience and Genetics, Laureate Institute for Brain Research, Tulsa, OK, USA; Department of Radiology, School of Medicine, University of California San Diego, La Jolla, CA, USA
David Azriel
Technion–Israel Institute of Technology, Haifa, Israel
Armin Schwartzman
Professor, University of California, San Diego
Signal and image analysis, manifold-valued data, random fields, brain imaging, environment