A Generalized Genetic Random Field Method for the Genetic Association Analysis of Sequencing Data

📅 2025-08-18
📈 Citations: 0
Influential: 0
🤖 AI Summary
To address challenges in rare-variant association analysis, including dependence on prespecified rare-variant thresholds, limited support for different phenotype types, and the need for small-sample adjustment, this paper proposes the generalized genetic random field (GGRF) method. Like other similarity-based methods (e.g., SKAT and SIMreg), GGRF needs no prespecified threshold for rare variants and can jointly test multiple variants whose effects differ in direction and magnitude. Because it is built on the generalized estimating equations (GEE) framework, it accommodates a variety of phenotypes (e.g., quantitative and binary). The method has a desirable asymptotic property and can be applied to small-scale sequencing data without small-sample adjustment. In simulations, GGRF attains power comparable to or better than that of SKAT, especially when rare variants play a significant role in disease etiology. Applied to data from the Dallas Heart Study, GGRF detected associations of two candidate genes, *ANGPTL3* and *ANGPTL4*, with serum triglyceride.

📝 Abstract
With the advance of high-throughput sequencing technologies, it has become feasible to investigate the influence of the entire spectrum of sequencing variations on complex human diseases. Although association studies utilizing the new sequencing technologies hold great promise to unravel novel genetic variants, especially rare genetic variants that contribute to human diseases, the statistical analysis of high-dimensional sequencing data remains a challenge. Advanced analytical methods are greatly needed to facilitate high-dimensional sequencing data analyses. In this article, we propose a generalized genetic random field (GGRF) method for association analyses of sequencing data. Like other similarity-based methods (e.g., SIMreg and SKAT), the new method avoids the need to specify thresholds for rare variants and allows for testing multiple variants whose effects differ in direction and magnitude. The method is built on the generalized estimating equation framework and thus accommodates a variety of disease phenotypes (e.g., quantitative and binary phenotypes). Moreover, it has a desirable asymptotic property and can be applied to small-scale sequencing data without the need for small-sample adjustment. Through simulations, we demonstrate that the proposed GGRF attains improved or comparable power relative to a commonly used method, SKAT, under various disease scenarios, especially when rare variants play a significant role in disease etiology. We further illustrate GGRF with an application to a real dataset from the Dallas Heart Study. By using GGRF, we were able to detect the association of two candidate genes, ANGPTL3 and ANGPTL4, with serum triglyceride.
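
The similarity-based idea mentioned in the abstract can be illustrated with a small sketch. This is not the GGRF estimating-equation test described in the paper; it is a generic similarity-based score with a permutation p-value, written under several assumptions: a weighted identity-by-state (IBS) similarity, inverse-standard-deviation weights that emphasize rare variants, an intercept-only null model, and hypothetical function names (`weighted_ibs_similarity`, `similarity_score`, `permutation_pvalue`). It only shows how rare variants can contribute through weights rather than through a frequency cutoff.

```python
import numpy as np


def weighted_ibs_similarity(G, weights=None):
    """Weighted IBS similarity between subjects.

    G is an (n, p) genotype matrix coded 0/1/2 (minor-allele counts).
    The weighting scheme is an assumption of this sketch, not taken
    from the paper.
    """
    G = np.asarray(G, dtype=float)
    if weights is None:
        maf = G.mean(axis=0) / 2.0
        var = maf * (1.0 - maf)
        # Rare variants get larger weights; monomorphic variants get 0.
        weights = np.where(var > 0, 1.0 / np.sqrt(np.where(var > 0, var, 1.0)), 0.0)
    weights = np.asarray(weights, dtype=float)
    denom = 2.0 * weights.sum()
    if denom == 0:
        return np.zeros((G.shape[0], G.shape[0]))
    # Per-variant IBS between subjects i and j is 2 - |g_i - g_j|.
    ibs = 2.0 - np.abs(G[:, None, :] - G[None, :, :])
    return (ibs * weights).sum(axis=2) / denom


def similarity_score(residuals, S):
    """Sum over i != j of s_ij * r_i * r_j (diagonal excluded)."""
    S_off = S - np.diag(np.diag(S))
    return float(residuals @ S_off @ residuals)


def permutation_pvalue(y, G, n_perm=999, seed=0):
    """Permutation p-value for the similarity score.

    Residuals come from an intercept-only null model, so the same code
    runs for quantitative and 0/1 phenotypes.
    """
    rng = np.random.default_rng(seed)
    y = np.asarray(y, dtype=float)
    r = y - y.mean()
    S = weighted_ibs_similarity(G)
    q_obs = similarity_score(r, S)
    q_perm = np.array(
        [similarity_score(rng.permutation(r), S) for _ in range(n_perm)]
    )
    p = (1 + np.sum(q_perm >= q_obs)) / (1 + n_perm)
    return q_obs, float(p)
```

A call such as `permutation_pvalue(y, G)` returns the observed score and its permutation p-value. Covariates could be handled by replacing `y - y.mean()` with residuals from a fitted null regression, which is again a choice of this sketch rather than something stated in the abstract.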
Problem

Research questions and friction points this paper is trying to address.

Analyzing high-dimensional sequencing data for genetic associations
Detecting rare variants' influence on complex human diseases
Developing a flexible method for diverse disease phenotypes
Innovation

Methods, ideas, or system contributions that make the work stand out.

Generalized genetic random field method for sequencing data
Accommodates various phenotypes (e.g., quantitative and binary) and requires no rare-variant threshold
Improved or comparable power relative to SKAT, especially when rare variants play a significant role
Ming Li
Division of Biostatistics, Department of Pediatrics, University of Arkansas for Medical Sciences, Little Rock, Arkansas, United States of America
Zihuai He
Department of Biostatistics, University of Michigan, Ann Arbor, Michigan, United States of America
Min Zhang
Department of Biostatistics, University of Michigan, Ann Arbor, Michigan, United States of America
Xiaowei Zhan
Professor of Materials Science, Peking University
Polymer Chemistry, Organic Electronics
Changshuai Wei
LinkedIn
Machine Learning, Statistics
Robert C Elston
Department of Epidemiology and Biostatistics, Case Western Reserve University, Cleveland, Ohio, United States of America
Qing Lu
Associate Professor, Division of Biostatistics, Department of Epidemiology and Biostatistics
statistical genetics, bioinformatics, genetic epidemiology