Question: Interpreting the r2 correlation between SNPs in PLINK pruning
Floris Brenk wrote (6.6 years ago):

Hi all,

PLINK has the function "Linkage disequilibrium based SNP pruning", invoked as --indep 50 5 2, where the 2 is the VIF threshold. Since VIF = 1/(1-R^2), a threshold of 2 corresponds to r2 = 0.50.

Linkage disequilibrium is the non-random association of alleles. I'm struggling a bit to understand what, for example, an r2 of 0.5 means and how PLINK calculates it. Does 0.5 simply mean a correlation of 0.5 between two SNPs? Can anyone explain what this 0.5 actually means in real numbers? For example, if I have 100 samples, how many need to be in perfect LD to reach an r2 of 0.5?



Tags: pruning, r2, plink, snp
chrchang523 (United States) wrote (6.6 years ago):

It's the squared correlation coefficient between the 0/1/2 allele counts, i.e.

r^2 = Cov(x1, x2)^2 / (Var(x1) * Var(x2))

where x1 and x2 are the allele counts at marker 1 and marker 2.
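To make the formula concrete, here is a minimal sketch in plain Python. It is not PLINK's actual implementation. The function name `r2_allele_counts` and the two toy genotype vectors are made up for illustration, and missing genotypes are assumed to be absent.

```python
def r2_allele_counts(x, y):
    """Squared Pearson correlation between two vectors of 0/1/2 allele counts."""
    n = len(x)
    if n == 0 or n != len(y):
        raise ValueError("need two non-empty vectors of equal length")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Population (divide-by-n) covariance and variances; the choice of
    # n versus n-1 cancels out in the ratio below.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    var_x = sum((a - mean_x) ** 2 for a in x) / n
    var_y = sum((b - mean_y) ** 2 for b in y) / n
    if var_x == 0 or var_y == 0:
        raise ValueError("r^2 is undefined for a monomorphic marker")
    return cov ** 2 / (var_x * var_y)


# Hypothetical allele counts for six samples at two markers.
marker1 = [0, 0, 1, 1, 2, 2]
marker2 = [0, 1, 1, 0, 2, 2]

r2 = r2_allele_counts(marker1, marker2)
print(round(r2, 4))  # 0.5625
```

Two of the six samples disagree between the markers here, and r^2 comes out a little above 0.5. With identical vectors the function returns 1.0.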

The "number of SNPs in perfect LD" required depends on the exact distribution of allele counts at each marker. For example, if both markers have empirical MAF 0.01, even 99 samples in "perfect LD" are not enough to guarantee r^2 >= 0.5.
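A small numeric illustration of why low MAF matters, again in plain Python. The configuration is an assumed toy case, not necessarily the exact one meant above. There are 100 samples and both markers have two heterozygous carriers (empirical MAF 0.01), but only one carrier is shared. The helper `r2_allele_counts` is a hypothetical name.

```python
def r2_allele_counts(x, y):
    """Squared Pearson correlation between two vectors of 0/1/2 allele counts."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    var_x = sum((a - mean_x) ** 2 for a in x) / n
    var_y = sum((b - mean_y) ** 2 for b in y) / n
    return cov ** 2 / (var_x * var_y)


n = 100
marker1 = [0] * n
marker2 = [0] * n

# Marker 1: samples 0 and 1 are heterozygous.
marker1[0] = 1
marker1[1] = 1
# Marker 2: samples 0 and 2 are heterozygous (only sample 0 is shared).
marker2[0] = 1
marker2[2] = 1

maf1 = sum(marker1) / (2 * n)  # 2 minor alleles out of 200
maf2 = sum(marker2) / (2 * n)
concordant = sum(a == b for a, b in zip(marker1, marker2))
r2 = r2_allele_counts(marker1, marker2)

print(maf1, maf2)    # 0.01 0.01
print(concordant)    # 98
print(round(r2, 2))  # 0.24
```

So 98 of the 100 samples carry identical genotypes at the two markers, yet r^2 is only about 0.24. Almost all of the variance sits in the few minor-allele carriers, so a single mismatched carrier moves r^2 a long way.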
