Question: r2 correlation interpretation snp in plink pruning
4.9 years ago, Floris Brenk wrote:

Hi all,

PLINK has the function "Linkage disequilibrium based SNP pruning", invoked as --indep 50 5 2, where the 2 is the VIF threshold (VIF = 1/(1-R^2)), which in this case means r2 = 0.50.
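The VIF arithmetic above can be checked with a few lines of Python. This is only a sketch of the conversion R^2 = 1 - 1/VIF; the helper name `vif_to_r2` is made up for illustration and is not part of PLINK. As far as I understand the PLINK documentation, the R^2 in the VIF is the multiple correlation from regressing one SNP on all other SNPs in the window, so it is not necessarily the same thing as a pairwise r2.

```python
def vif_to_r2(vif: float) -> float:
    """Convert a variance inflation factor to the implied R^2 (hypothetical helper)."""
    if vif < 1:
        raise ValueError("VIF is at least 1 by definition")
    # VIF = 1 / (1 - R^2)  =>  R^2 = 1 - 1 / VIF
    return 1.0 - 1.0 / vif


# The third argument of "--indep 50 5 2" is the VIF threshold.
threshold = vif_to_r2(2.0)
print(threshold)  # 0.5
```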

So linkage disequilibrium is the non-random association of alleles. I'm struggling a bit with what, for example, an r2 of 0.5 means and how PLINK calculates it. Does 0.5 just mean a correlation of 0.5 between two SNPs? Can anyone explain a bit more what this 0.5 actually means in real numbers? For example, when I have 100 samples, how many SNPs need to be in perfect LD to reach an r2 of 0.5?



4.9 years ago, chrchang523 (United States) wrote:

It's the squared correlation coefficient between the 0/1/2 allele counts, i.e. (Cov(marker 1 allele counts, marker 2 allele counts))^2 / (Var(marker 1 allele counts) * Var(marker 2 allele counts)). So an r^2 of 0.5 corresponds to a correlation of about 0.71 between the two SNPs, not 0.5.
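To make the formula concrete, here is a minimal NumPy sketch that computes r^2 from two vectors of 0/1/2 allele counts. The function name `allele_count_r2` and the six-sample genotypes are invented for illustration; this mirrors the formula above and is not PLINK's actual implementation (it ignores missing genotypes, for instance).

```python
import numpy as np


def allele_count_r2(a, b):
    """Squared correlation between two 0/1/2 allele-count vectors (illustrative)."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    # Population covariance and variances (same normalisation, so it cancels).
    cov = np.mean((a - a.mean()) * (b - b.mean()))
    return cov ** 2 / (a.var() * b.var())


# Made-up genotypes for six samples; the markers disagree in one sample.
marker1 = [0, 1, 2, 1, 0, 2]
marker2 = [0, 1, 2, 0, 0, 2]

r2_example = allele_count_r2(marker1, marker2)
print(round(r2_example, 3))  # 0.828
```

Because the value is a squared correlation, it does not change if the counted allele is flipped at one marker (replacing each count x with 2 - x).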

The "number of SNPs in perfect LD" required depends on the exact distribution of allele counts for each marker; for example, if both markers have empirical MAF 0.01, even 99 samples in "perfect LD" is not enough to guarantee r^2 >= 0.5.
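A close variant of this rare-allele scenario can be worked through numerically. In the sketch below, the genotypes are made up: 100 samples, marker 1 has two heterozygous carriers (MAF 0.01), and marker 2 has only one of those carriers (so its MAF is 0.005 rather than exactly 0.01). The two markers agree in 99 of 100 samples, yet r^2 still falls just short of 0.5.

```python
import numpy as np

n_samples = 100

# Allele counts (0/1/2) per sample; everyone is homozygous major by default.
m1 = np.zeros(n_samples)
m2 = np.zeros(n_samples)

m1[0] = 1  # sample 0 is heterozygous at marker 1
m1[1] = 1  # sample 1 is heterozygous at marker 1
m2[0] = 1  # only sample 0 is heterozygous at marker 2

# Number of samples with identical genotypes at both markers.
concordant = int(np.sum(m1 == m2))

# Squared Pearson correlation of the allele counts.
r2 = np.corrcoef(m1, m2)[0, 1] ** 2

print(concordant)    # 99
print(round(r2, 4))  # 0.4949
```

With rare alleles almost all of the variance sits in a handful of carriers, so a single discordant carrier is enough to pull r^2 down sharply.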
