Question: What is the state of the art for GWAS in terms of statistical algorithms for case/control and quantitative traits?
8 months ago by
b.ambrozio20
b.ambrozio20 wrote:

Hello! I'm trying to understand what the best algorithm for GWAS is nowadays. I know we have many tools available, like Plink and Hail, but what is currently the best algorithm if I don't use any of them? Let's say I write a script in R or Python from scratch. Which statistical algorithm should I use? Is it linear mixed models (LMMs)? I'm confused because we can have binary phenotypes (case/control) or quantitative phenotypes. LMMs seem to address quantitative ones, but can they be used for case/control as well? Actually, what is the state of the art for both/each of them? Peer-reviewed papers as references would be appreciated. Thanks!

lmm gwas • 597 views

Actually, what is the state of the art for both/each of them?

That would be Plink.

written 8 months ago by WouterDeCoster
8 months ago by
chrchang5237.1k
United States
chrchang5237.1k wrote:

The main regression executed by Plink was introduced by EIGENSTRAT in ~2006; see https://www.nature.com/articles/ng1847 . This is actually straightforward to write in R/Python from scratch; the harder part is optimizing the implementation for large datasets.
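To make the "from scratch" point concrete, here is a minimal sketch for a quantitative trait: compute the top principal components of the genotype matrix, then regress the phenotype on each variant with those PCs as covariates. This is not Plink's or EIGENSTRAT's actual code; the function names `top_pcs` and `linear_assoc` are made up for illustration, genotypes are assumed to be coded 0/1/2 with no missing values, and the p-value uses a normal approximation to the t distribution (reasonable only for large sample sizes).

```python
import math
import numpy as np

def top_pcs(G, k):
    """Top-k principal components of an (n_samples, n_variants) genotype matrix.

    Hypothetical helper: assumes 0/1/2 coding and no missing genotypes.
    """
    sd = G.std(axis=0)
    keep = sd > 0                      # drop monomorphic variants (zero variance)
    Z = (G[:, keep] - G[:, keep].mean(axis=0)) / sd[keep]
    U, S, _ = np.linalg.svd(Z, full_matrices=False)
    return U[:, :k] * S[:k]            # sample scores on the first k PCs

def linear_assoc(y, g, covars):
    """OLS of y on [intercept, covars, g]; returns (beta, se, p) for g."""
    n = len(y)
    X = np.column_stack([np.ones(n), covars, g])
    XtX_inv = np.linalg.inv(X.T @ X)
    beta = XtX_inv @ (X.T @ y)
    resid = y - X @ beta
    sigma2 = resid @ resid / (n - X.shape[1])   # residual variance estimate
    se = math.sqrt(sigma2 * XtX_inv[-1, -1])
    z = beta[-1] / se
    # Two-sided p-value; normal approximation to the t distribution (assumption).
    p = math.erfc(abs(z) / math.sqrt(2))
    return beta[-1], se, p
```

Usage would be something like `pcs = top_pcs(G, 10)` followed by `linear_assoc(y, G[:, j], pcs)` for each variant `j`. Refitting the full model per variant is the slow but obvious way to do it; as noted above, the hard part in real tools is making this fast on large datasets.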

The Firth regression added to Plink 2.0 to improve handling of rare variants and imbalanced binary phenotypes was motivated by https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4049324/ .
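For a binary phenotype, the same per-variant setup uses logistic regression, and the Firth correction is a small change to the score function. The sketch below is a plain Newton-style iteration with the standard Firth (Jeffreys-prior) adjustment based on the hat-matrix diagonal. It is an illustration, not Plink 2.0's implementation; `logistic_fit` is a hypothetical name, there is no step-halving or other safeguard, and the returned standard errors are simple Wald-type values, whereas Firth p-values are often taken from penalized likelihood ratio tests instead.

```python
import numpy as np

def logistic_fit(X, y, firth=True, max_iter=50, tol=1e-8):
    """Logistic regression by Newton-style iteration, optionally Firth-penalized.

    X: (n, p) design matrix including an intercept column; y: 0/1 vector.
    Returns (beta, se). Hypothetical helper for illustration only.
    """
    beta = np.zeros(X.shape[1])
    for _ in range(max_iter):
        p = 1.0 / (1.0 + np.exp(-(X @ beta)))
        w = p * (1.0 - p)
        info_inv = np.linalg.inv(X.T @ (X * w[:, None]))   # inverse Fisher information
        r = y - p
        if firth:
            # Hat-matrix diagonal: h_i = w_i * x_i' I^{-1} x_i
            h = w * np.einsum("ij,jk,ik->i", X, info_inv, X)
            r = r + h * (0.5 - p)                          # Firth-modified score residual
        step = info_inv @ (X.T @ r)
        beta = beta + step
        if np.max(np.abs(step)) < tol:
            break
    # Standard errors from the information matrix at the final estimate.
    p = 1.0 / (1.0 + np.exp(-(X @ beta)))
    w = p * (1.0 - p)
    se = np.sqrt(np.diag(np.linalg.inv(X.T @ (X * w[:, None]))))
    return beta, se
```

With `firth=False` this is ordinary maximum likelihood, which diverges when a variant perfectly separates cases from controls (common for rare variants). With `firth=True` the estimate stays finite; for a single binary predictor it matches the classic trick of adding 0.5 to each cell of the 2x2 table.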

Mixed linear models provide better statistical power when you have lots of close relatives in your dataset, but are much trickier to solve; actually, this is still a significant research area. Two tools covering parts of the current state-of-the-art are SAIGE (https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6119127/ ; handles imbalanced binary phenotypes, but relatively slow) and fastGWA (https://www.nature.com/articles/s41588-019-0530-8 ; great speed, but doesn't support dosage data yet and uses a misspecified model for binary phenotypes).
