Question: LD pruning and other QC before association analysis?
4.9 years ago by
Picasa590 wrote:


Usually we perform QC steps such as removing SNPs with a low MAF, LD pruning, etc. before a PCA.

But should we also apply these QC filters before an association analysis? By association analysis, I mean a classical test such as case/control association with SNPs, as explained here:

qc gwas • 8.1k views
4.8 years ago by
Shab86270 wrote:

The answer is yes, and more! I am posting links to two tutorial papers on QC before GWAS. Hope it helps. Refs:


Here is the link to slides that explain the Nature protocol more clearly:

Reply written 3.3 years ago by Jerry Zhu
4.8 years ago by
United Kingdom
Nick60 wrote:

These steps are necessary before PCA in order to identify the principal dimensions of genetic variation between samples, without over-weighting the contribution of groups of correlated SNPs.
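To make the idea of "groups of correlated SNPs" concrete, here is a minimal sketch of greedy LD pruning in plain Python. It is an illustration only, not the algorithm of any particular tool (PLINK's `--indep-pairwise` is the usual way to do this in practice). The function names, the window of 50 SNPs, and the r² threshold of 0.2 are all assumptions chosen for the example.

```python
def r_squared(x, y):
    """Squared Pearson correlation between two genotype vectors (0/1/2 allele counts)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # monomorphic SNP: correlation undefined, treated as uncorrelated
    return cov * cov / (var_x * var_y)


def ld_prune(genotypes, window=50, r2_threshold=0.2):
    """Greedy pruning over SNPs ordered by position.

    A SNP is kept unless its r^2 with an already-kept SNP fewer than
    `window` positions back exceeds `r2_threshold` (illustrative defaults).
    Returns the indices of the SNPs that are kept.
    """
    kept = []
    for i, snp in enumerate(genotypes):
        recent = [j for j in kept if i - j < window]
        if all(r_squared(snp, genotypes[j]) <= r2_threshold for j in recent):
            kept.append(i)
    return kept


# Toy data: one list per SNP, one entry per sample.
genotypes = [
    [0, 1, 2, 0, 1, 2],  # SNP 0
    [0, 1, 2, 0, 1, 2],  # SNP 1: identical to SNP 0, so in perfect LD with it
    [2, 1, 0, 0, 1, 2],  # SNP 2: uncorrelated with SNP 0
]
kept = ld_prune(genotypes)
print(kept)
```

On this toy input SNP 1 is dropped because it duplicates SNP 0, so the PCA would see that signal once rather than twice.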

PCA is just one of the QC steps you should perform to prepare data for case/control association testing. It can be used to establish whether samples share a common ancestry, and you might want to exclude outlier samples. Other QC steps include: removing SNPs with a low genotype calling score (e.g. GenTrain score and cluster separation score in GenomeStudio); removing SNPs and samples with a low call rate; removing SNPs that fail the HWE test; checking inferred gender against recorded gender; removing one of each pair of related samples (for an unrelated case-control design); and removing samples that are outliers in the heterozygosity/inbreeding test. See for example this protocol for exome chip QC.
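Three of the per-SNP checks listed above (call rate, MAF, and HWE) can be sketched in a few lines of Python. This is a simplified illustration: it uses a 1-degree-of-freedom chi-square HWE test rather than the exact test that dedicated tools typically offer, and the thresholds (call rate 0.98, MAF 0.01, HWE p-value 1e-6) are placeholder assumptions, not recommendations from this thread.

```python
from math import erfc, sqrt


def snp_qc(genotypes):
    """Return (call_rate, maf, hwe_p) for one SNP.

    genotypes: list of 0/1/2 allele counts, with None for a missing call.
    """
    called = [g for g in genotypes if g is not None]
    n = len(called)
    call_rate = n / len(genotypes)
    if n == 0:
        return call_rate, 0.0, 1.0  # nothing called: no frequency or HWE information

    n_aa, n_ab, n_bb = called.count(0), called.count(1), called.count(2)
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of the allele counted as 0
    q = 1.0 - p
    maf = min(p, q)

    # Chi-square goodness of fit against Hardy-Weinberg proportions.
    observed = [n_aa, n_ab, n_bb]
    expected = [n * p * p, 2 * n * p * q, n * q * q]
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    hwe_p = erfc(sqrt(chi2 / 2))  # upper tail of a chi-square with 1 df
    return call_rate, maf, hwe_p


def passes_qc(genotypes, min_call_rate=0.98, min_maf=0.01, min_hwe_p=1e-6):
    """Apply illustrative thresholds (assumptions for this example only)."""
    call_rate, maf, hwe_p = snp_qc(genotypes)
    return call_rate >= min_call_rate and maf >= min_maf and hwe_p >= min_hwe_p


in_hwe = [0] * 25 + [1] * 50 + [2] * 25  # exactly at HWE proportions
all_het = [1] * 100                      # far too many heterozygotes
many_missing = in_hwe + [None] * 10      # call rate of 100/110

print(passes_qc(in_hwe), passes_qc(all_het), passes_qc(many_missing))
```

In a case/control study the HWE filter is often applied to controls only, since a true association can itself distort genotype proportions in cases; the sketch does not make that distinction.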

Assuming you have performed these QC steps and are left with a clean dataset, you should perform the case-control analysis without LD pruning. You can also include SNPs with a low MAF, but your analysis may have low power to detect significant rare SNPs. Afterwards you can inspect a QQ plot of the association p-values to check that the assumptions of your statistical association test are satisfied. It is important to check cluster plots for any significant SNPs you find (genotypes can be particularly difficult to call correctly for rare SNPs).
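The QQ check mentioned above can be sketched as follows, assuming you already have one p-value per SNP from your association test. The code computes the points of a QQ plot and the genomic inflation factor (lambda), a common single-number summary of the same comparison. The function names and the synthetic p-values are assumptions for illustration.

```python
from math import log10
from statistics import NormalDist, median

# Median of a chi-square distribution with 1 degree of freedom.
CHI2_1DF_MEDIAN = 0.4549364


def genomic_inflation(p_values):
    """Lambda: median observed chi-square (1 df) divided by the expected median.

    Each p-value must satisfy 0 < p <= 1. It is converted back to a
    chi-square statistic through the normal quantile function.
    """
    nd = NormalDist()
    chi2 = [nd.inv_cdf(p / 2) ** 2 for p in p_values]
    return median(chi2) / CHI2_1DF_MEDIAN


def qq_points(p_values):
    """Return (expected, observed) pairs of -log10(p), smallest p first."""
    n = len(p_values)
    ordered = sorted(p_values)
    return [(-log10((i + 0.5) / n), -log10(p)) for i, p in enumerate(ordered)]


# Synthetic p-values that follow the null distribution exactly.
uniform_p = [(i + 0.5) / 1000 for i in range(1000)]
lam = genomic_inflation(uniform_p)
points = qq_points(uniform_p)
print(round(lam, 3))
```

A lambda close to 1 and QQ points lying on the diagonal suggest the test statistics behave as expected under the null; values well above 1 usually point to residual population structure, relatedness, or genotyping artefacts rather than widespread true association.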
