High-dimensional covariance regression with application to co-expression QTL detection

📅 2024-04-02
📈 Citations: 1
Influential: 0
🤖 AI Summary
This paper addresses a central question in co-expression QTL (co-eQTL) studies: *whether and how genetic variants (e.g., SNPs) modulate gene co-expression networks*. It proposes a sparse covariance regression model in which a large covariance matrix varies with high-dimensional subject-level covariates (e.g., multi-SNP genotypes). Methodologically, the model uses a joint ℓ₁/ℓ₂ regularization to enforce a sparse-group structure, so that both the covariates with non-zero effects and the edges they modulate are sparse. Parameters are estimated with a blockwise coordinate descent algorithm, the ℓ₁ and ℓ₂ convergence rates of the estimator are analyzed, and a computationally efficient debiased procedure provides inference and uncertainty quantification. Applied to gene expression and genotype data from glioblastoma patients, the method identifies co-expression edges whose strength varies with SNP genotype.

📝 Abstract
While covariance matrices have been widely studied in many scientific fields, relatively limited progress has been made on estimating conditional covariances that permit a large covariance matrix to vary with high-dimensional subject-level covariates. In this paper, we present a new sparse covariance regression framework that models the covariance matrix as a function of subject-level covariates. In the context of co-expression quantitative trait locus (QTL) studies, our method can be used to determine if and how gene co-expressions vary with genetic variations. To accommodate high-dimensional responses and covariates, we stipulate a combined sparsity structure that encourages covariates with non-zero effects and edges that are modulated by these covariates to be simultaneously sparse. We approach parameter estimation with a blockwise coordinate descent algorithm, and investigate the $\ell_1$ and $\ell_2$ convergence rates of the estimated parameters. In addition, we propose a computationally efficient debiased inference procedure for uncertainty quantification. The efficacy of the proposed method is demonstrated through numerical experiments and an application to a gene co-expression network study with brain cancer patients.
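To make "the covariance matrix as a function of subject-level covariates" concrete, the sketch below assumes, purely for illustration, a linear form in which a baseline matrix is shifted by one effect matrix per covariate. The abstract does not state the functional form, so the function name `conditional_covariance`, the toy matrices, and the 0/1/2 genotype coding are all hypothetical.

```python
def conditional_covariance(x, base, effects):
    """Return Sigma(x) = base + sum_j x[j] * effects[j] as a list of lists.

    The linear form is an illustrative assumption, not a model taken
    from the abstract.
    """
    p = len(base)
    sigma = [row[:] for row in base]  # copy so the baseline is not mutated
    for x_j, effect in zip(x, effects):
        for a in range(p):
            for b in range(p):
                sigma[a][b] += x_j * effect[a][b]
    return sigma


# Hypothetical toy setting: 2 genes, 2 SNPs coded as minor-allele counts.
base = [[1.0, 0.1], [0.1, 1.0]]
effects = [
    [[0.0, 0.2], [0.2, 0.0]],  # SNP 0 modulates the gene0-gene1 edge
    [[0.0, 0.0], [0.0, 0.0]],  # SNP 1 has no effect on any edge
]

sigma_carrier = conditional_covariance([2, 1], base, effects)
sigma_noncarrier = conditional_covariance([0, 1], base, effects)
```

In this toy setting, only the off-diagonal entry linking the two genes differs between the two subjects. An all-zero effect matrix corresponds to a covariate with no effect, and zero entries within a non-zero effect matrix correspond to edges that the covariate does not modulate, which is the kind of structure the combined sparsity assumption targets.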
Problem

Research questions and friction points this paper is trying to address.

Estimating conditional covariances with high-dimensional covariates
Modeling gene co-expression variations with genetic factors
Developing sparse regression for high-dimensional responses and covariates
Innovation

Methods, ideas, or system contributions that make the work stand out.

Sparse covariance regression with subject-level covariates
Blockwise coordinate descent for parameter estimation
Debiased inference for uncertainty quantification
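A building block commonly used when a penalty combines element-wise and group-wise sparsity is the sparse-group soft-thresholding operator: shrink each entry first, then shrink the whole group. The sketch below shows that operator in isolation. It is not the paper's blockwise coordinate descent algorithm, and the names `sparse_group_prox`, `lam_l1`, and `lam_group` are hypothetical.

```python
import math


def soft_threshold(value, lam):
    """Shrink a scalar toward zero by lam (element-wise l1 step)."""
    if value > lam:
        return value - lam
    if value < -lam:
        return value + lam
    return 0.0


def sparse_group_prox(group, lam_l1, lam_group):
    """Proximal operator of lam_l1 * ||v||_1 + lam_group * ||v||_2.

    Entries are soft-thresholded first; the surviving vector is then
    scaled toward zero and dropped entirely if its norm is too small.
    """
    shrunk = [soft_threshold(v, lam_l1) for v in group]
    norm = math.sqrt(sum(v * v for v in shrunk))
    if norm <= lam_group:
        return [0.0] * len(group)  # the whole group is zeroed out
    scale = 1.0 - lam_group / norm
    return [scale * v for v in shrunk]


# Hypothetical groups: each list holds one covariate's effects on three edges.
strong = sparse_group_prox([3.0, -0.2, 4.0], lam_l1=0.5, lam_group=1.0)
weak = sparse_group_prox([0.6, -0.2, 0.4], lam_l1=0.5, lam_group=1.0)
```

Here the `strong` group keeps its two large entries while its small entry is set to zero, and the `weak` group is removed as a whole. This mirrors the two levels of sparsity described above: few covariates with any effect, and few modulated edges per active covariate.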