🤖 AI Summary
This paper addresses the fundamental question in co-expression QTL (co-eQTL) studies: *how genetic variants (e.g., SNPs) dynamically regulate gene co-expression networks*. To this end, we propose the first interpretable and inferential high-dimensional conditional covariance regression model, where the covariance matrix varies nonparametrically with individual-level high-dimensional covariates (e.g., multi-SNP genotypes). Methodologically, we introduce a joint ℓ₁/ℓ₂ regularization to enforce sparse-group structure, design a block-wise coordinate descent algorithm with theoretical convergence guarantees, and integrate debiased estimation for valid high-dimensional statistical inference. We establish, for the first time, the optimal ℓ₁ and ℓ₂ convergence rates and construct asymptotically exact confidence intervals. Applied to glioblastoma gene expression and genotype data, our method successfully identifies SNP-regulated dynamic co-expression edges, demonstrating both modeling efficacy and biological interpretability in capturing genetic regulation of co-expression networks.
📝 Abstract
While covariance matrices have been widely studied in many scientific fields, relatively limited progress has been made on estimating conditional covariances that permit a large covariance matrix to vary with high-dimensional subject-level covariates. In this paper, we present a new sparse covariance regression framework that models the covariance matrix as a function of subject-level covariates. In the context of co-expression quantitative trait locus (QTL) studies, our method can be used to determine if and how gene co-expressions vary with genetic variations. To accommodate high-dimensional responses and covariates, we stipulate a combined sparsity structure that encourages covariates with non-zero effects and edges that are modulated by these covariates to be simultaneously sparse. We approach parameter estimation with a blockwise coordinate descent algorithm, and investigate the $\ell_1$ and $\ell_2$ convergence rates of the estimated parameters. In addition, we propose a computationally efficient debiased inference procedure for uncertainty quantification. The efficacy of the proposed method is demonstrated through numerical experiments and an application to a gene co-expression network study with brain cancer patients.
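The combined sparsity structure can be illustrated with the standard sparse-group proximal step, which a blockwise coordinate descent scheme might apply to one covariate's block of edge coefficients at a time. This is a minimal sketch under assumptions, not the paper's algorithm: the function names, the penalty form `lam1 * ||v||_1 + lam2 * ||v||_2`, and the example values are all hypothetical.

```python
import numpy as np

def soft_threshold(v, lam):
    """Elementwise soft-thresholding: the proximal operator of lam * ||v||_1."""
    return np.sign(v) * np.maximum(np.abs(v) - lam, 0.0)

def sparse_group_prox(v, lam1, lam2):
    """Proximal operator of lam1 * ||v||_1 + lam2 * ||v||_2 for one group v.

    Assumption: v holds the edge coefficients tied to a single covariate.
    The l1 part zeroes individual edges; the l2 part can zero the whole group.
    """
    u = soft_threshold(v, lam1)          # edge-level sparsity
    norm = np.linalg.norm(u)
    if norm <= lam2:                     # covariate-level sparsity: drop the group
        return np.zeros_like(u)
    return (1.0 - lam2 / norm) * u       # otherwise shrink the surviving edges

# Hypothetical coefficient blocks for two covariates (illustrative values only).
weak_group = np.array([0.3, -0.2, 0.1])
strong_group = np.array([3.0, -0.05, 4.0])

weak_out = sparse_group_prox(weak_group, lam1=0.1, lam2=0.5)
strong_out = sparse_group_prox(strong_group, lam1=0.1, lam2=0.5)
print(weak_out)
print(strong_out)
```

In this toy setting the weak block is removed entirely, while the strong block keeps its two large edges and loses the small one, which is the "simultaneously sparse" behaviour the abstract describes for covariates and the edges they modulate.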