Question: What is the state-of-the-art statistical algorithm for GWAS with case/control and quantitative traits?
b.ambrozio wrote (6 days ago):

Hello! I'm trying to understand what the best algorithm for GWAS is nowadays. I know there are many tools available, such as Plink and Hail, but what is the best algorithm if I don't use any of them and instead write a script in R or Python from scratch? Which statistical algorithm should I use? Is it linear mixed models (LMMs)? I'm confused because phenotypes can be binary (case/control) or quantitative. LMMs seem to address quantitative phenotypes, but can they be used for case/control as well? Actually, what is the state of the art for both/each of them? Peer-reviewed papers as references would be appreciated. Thanks!

Tags: lmm, gwas

Comment by WouterDeCoster (6 days ago):

"Actually, what is the state of the art for both/each of them?"

That would be Plink.
Answer by chrchang523 (United States), 5 days ago:

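The main regression executed by Plink (per-variant linear or logistic regression, typically with top principal components as covariates to correct for population stratification) follows the approach introduced by EIGENSTRAT in ~2006; see https://www.nature.com/articles/ng1847 . It is straightforward to write in R/Python from scratch; the harder part is optimizing the implementation for large datasets.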
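As a rough illustration of the "from scratch" part, here is a minimal NumPy sketch for a quantitative trait: compute top principal components from the genotype matrix, then run a per-variant linear regression with those PCs as covariates. The function names (`top_pcs`, `linear_assoc`), the 0/1/2 genotype coding, and the normal approximation for the p-values are assumptions made for this example; it is not Plink's or EIGENSTRAT's actual implementation, and it assumes missing genotypes and monomorphic variants were filtered out beforehand.

```python
import math
import numpy as np


def top_pcs(G, k):
    """Top-k sample principal components of an (n_samples, n_variants) genotype matrix."""
    Z = G - G.mean(axis=0)
    sd = Z.std(axis=0)
    keep = sd > 0                      # skip monomorphic variants when building PCs
    Z = Z[:, keep] / sd[keep]
    U, S, _ = np.linalg.svd(Z, full_matrices=False)
    return U[:, :k] * S[:k]


def linear_assoc(G, y, covars):
    """Per-variant linear regression of y on genotype, adjusting for covariates.

    Both y and every genotype column are residualized on [intercept, covars],
    which gives the same slope as fitting the full model variant by variant.
    """
    n = len(y)
    C = np.column_stack([np.ones(n), covars])
    Q, _ = np.linalg.qr(C)             # orthonormal basis of the covariate space
    y_r = y - Q @ (Q.T @ y)
    G_r = G - Q @ (Q.T @ G)
    sxx = (G_r ** 2).sum(axis=0)
    sxy = G_r.T @ y_r
    beta = sxy / sxx
    df = n - C.shape[1] - 1            # residual degrees of freedom
    rss = (y_r ** 2).sum() - beta * sxy
    se = np.sqrt(rss / df / sxx)
    z = beta / se
    # Two-sided p-value from a normal approximation (assumes a large sample).
    p = np.array([math.erfc(abs(v) / math.sqrt(2.0)) for v in z])
    return beta, se, p
```

With a genotype matrix `G` and phenotype vector `y`, a call would look like `beta, se, p = linear_assoc(G, y, top_pcs(G, 10))`; the number of PCs is a choice you would have to make for your own data. For case/control traits the same structure applies with logistic regression in place of the linear fit.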

The Firth regression added to Plink 2.0 to improve handling of rare variants and imbalanced binary phenotypes was motivated by https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4049324/ .
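To make the idea concrete, here is a small sketch of Firth-penalized logistic regression using the usual modified-score iteration: the ordinary score `X'(y - p)` gets an extra term `h * (0.5 - p)`, where `h` holds the diagonal of the hat matrix. The function name `firth_logistic`, the step cap, and the Wald-style standard errors are assumptions for this example, not a description of how Plink 2.0 implements it.

```python
import numpy as np


def firth_logistic(X, y, max_iter=100, tol=1e-8, max_step=5.0):
    """Firth-penalized logistic regression via a Newton-type iteration.

    X is an (n, p) design matrix that already includes an intercept column;
    y is a 0/1 vector. Returns (coefficients, approximate standard errors).
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    beta = np.zeros(X.shape[1])
    for _ in range(max_iter):
        p = 1.0 / (1.0 + np.exp(-(X @ beta)))
        w = p * (1.0 - p)
        info = X.T @ (X * w[:, None])          # Fisher information X' W X
        cov = np.linalg.inv(info)
        # Hat-matrix diagonal: h_i = w_i * x_i' (X' W X)^-1 x_i
        h = w * np.einsum("ij,jk,ik->i", X, cov, X)
        # Firth's modified score adds h * (0.5 - p) to the usual residual.
        score = X.T @ (y - p + h * (0.5 - p))
        step = cov @ score
        biggest = np.abs(step).max()
        if biggest > max_step:                 # crude safeguard against huge steps
            step *= max_step / biggest
        beta += step
        if np.abs(step).max() < tol:
            break
    # cov comes from the last iterate before the final, negligible step.
    return beta, np.sqrt(np.diag(cov))
```

Unlike plain maximum likelihood, this stays finite when a variant perfectly separates cases from controls, which is the situation that rare variants and imbalanced case/control ratios tend to produce. In practice a penalized likelihood-ratio test is often preferred over the Wald standard errors returned here.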

Mixed linear models provide better statistical power when your dataset contains many close relatives, but they are much trickier to fit, and this is still an active research area. Two tools covering parts of the current state of the art are SAIGE (https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6119127/ ; handles imbalanced binary phenotypes, but is relatively slow) and fastGWA (https://www.nature.com/articles/s41588-019-0530-8 ; very fast, but doesn't support dosage data yet and uses a misspecified model for binary phenotypes).
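For intuition only, here is a toy sketch of a mixed-model association test for a quantitative trait. It assumes a precomputed kinship (genetic relationship) matrix `K`, estimates a single heritability-like parameter `h2` once under the null model by grid search, and then tests each variant by generalized least squares. This is a simplified approximation written for this example; it is not how SAIGE or fastGWA work internally, and the names `fit_null_h2` and `lmm_assoc` are hypothetical. The dense eigendecomposition it relies on is exactly the step that does not scale to large cohorts.

```python
import math
import numpy as np


def fit_null_h2(y_rot, C_rot, s, n_grid=100):
    """Grid-search ML estimate of h2 in Var(y) = sigma2 * (h2*K + (1-h2)*I).

    Inputs are already rotated by the eigenvectors of K; s holds its eigenvalues.
    """
    n = len(y_rot)
    best_ll, best_h2 = -np.inf, 0.0
    for h2 in np.linspace(0.0, 0.99, n_grid):
        d = h2 * s + (1.0 - h2)                # diagonal covariance after rotation
        w = 1.0 / d
        Cw = C_rot * w[:, None]
        b = np.linalg.solve(C_rot.T @ Cw, Cw.T @ y_rot)
        r = y_rot - C_rot @ b
        sigma2 = (w * r * r).sum() / n
        ll = -0.5 * (n * math.log(2.0 * math.pi * sigma2) + np.log(d).sum() + n)
        if ll > best_ll:
            best_ll, best_h2 = ll, h2
    return best_h2


def lmm_assoc(G, y, K, covars=None):
    """Per-variant GLS test with the variance ratio fixed from the null model."""
    n = len(y)
    C = np.ones((n, 1)) if covars is None else np.column_stack([np.ones(n), covars])
    s, U = np.linalg.eigh(K)
    s = np.clip(s, 0.0, None)                  # guard against tiny negative eigenvalues
    y_rot, C_rot, G_rot = U.T @ y, U.T @ C, U.T @ G
    h2 = fit_null_h2(y_rot, C_rot, s)
    # Whiten so that ordinary least squares on the transformed data equals GLS.
    sw = 1.0 / np.sqrt(h2 * s + (1.0 - h2))
    y_w, C_w, G_w = y_rot * sw, C_rot * sw[:, None], G_rot * sw[:, None]
    Q, _ = np.linalg.qr(C_w)
    y_r = y_w - Q @ (Q.T @ y_w)
    G_r = G_w - Q @ (Q.T @ G_w)
    sxx = (G_r ** 2).sum(axis=0)
    sxy = G_r.T @ y_r
    beta = sxy / sxx
    df = n - C.shape[1] - 1
    se = np.sqrt(((y_r ** 2).sum() - beta * sxy) / df / sxx)
    # Two-sided p-value from a normal approximation.
    p = np.array([math.erfc(abs(v) / math.sqrt(2.0)) for v in beta / se])
    return beta, se, p, h2
```

A common way to build `K` is `Z @ Z.T / n_variants` from standardized genotypes `Z`. Binary phenotypes need a different (generalized) mixed model, which is the gap the tools above are trying to fill.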
