formula to calculate LD in plink
9.6 years ago

Could anyone tell me the formula PLINK uses to calculate linkage disequilibrium? The results from PLINK differ from those of a script I wrote, which uses r^2 = (ad - bc)^2 / ((a + b)(a + c)(c + d)(b + d)). Thank you.

LD plink • 4.8k views
9.6 years ago

See the "correlation coefficient" definition under http://en.wikipedia.org/wiki/Linkage_disequilibrium#Definition, and the discussion at http://pngu.mgh.harvard.edu/~purcell/plink/ld.shtml#ld2. The basic r^2 computation involves correlation between the 0/1/2 allele counts instead of haplotype frequencies, but you can also tell plink to estimate haplotype frequencies and use the standard formula on them (results will rarely differ by much).
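For reference, here is a minimal Python sketch of the "standard formula" on haplotype counts, assuming a, b, c, d are the counts of the four haplotypes in a 2x2 table (AB, Ab, aB, ab). The function names are made up for illustration and this is not PLINK's code. It only shows that the formula in the question is the same quantity as D^2 / (pA(1 - pA) pB(1 - pB)) from the Wikipedia definition.

```python
def r2_from_counts(a, b, c, d):
    """r^2 from a 2x2 table of haplotype counts (AB, Ab, aB, ab)."""
    numerator = (a * d - b * c) ** 2
    denominator = (a + b) * (a + c) * (c + d) * (b + d)
    return numerator / denominator


def r2_from_frequencies(a, b, c, d):
    """Same quantity, written as D^2 / (pA (1 - pA) pB (1 - pB))."""
    n = a + b + c + d
    p_ab = a / n          # frequency of haplotype AB
    p_a = (a + b) / n     # frequency of allele A at the first SNP
    p_b = (a + c) / n     # frequency of allele B at the second SNP
    d_coef = p_ab - p_a * p_b
    return d_coef ** 2 / (p_a * (1 - p_a) * p_b * (1 - p_b))


# Toy counts, chosen only for illustration.
print(r2_from_counts(40, 10, 10, 40))
print(r2_from_frequencies(40, 10, 10, 40))
```

Both functions divide by zero if either SNP is monomorphic, so real data would need a guard for that case.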


@chrchang523 Could you please provide a numerical example? How does PLINK recode a SNP (two allele columns) into one numeric value? Is it 11=0; 22=2; 12=21=1? And then, how is the correlation calculated? Thank you for your help!

EDIT: I think I found how PLINK calculates the r2. The program counts the number of copies of the minor-frequency allele at each SNP and then calculates the correlation between these counts:

snp11 snp12   snp21 snp22  counts_in_snp1   counts_in_snp2
  1     2       2     2          1                0
  2     2       2     2          0                0
  1     1       1     2          2                1
  1     2       2     2          1                0
...

It then computes the correlation between counts_in_snp1 and counts_in_snp2, and squares it to get r2.
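A small Python sketch of that procedure, using only the four rows shown in the table as toy data (the real data set has more rows, so the number printed here is not a PLINK result). The helper names are hypothetical, and allele "1" is counted at both SNPs simply to match the counts columns above. Counting the other allele instead would flip the sign of r but leave r2 unchanged.

```python
from math import sqrt


def allele_count(a1, a2, counted):
    """Number of copies (0, 1 or 2) of the counted allele in one genotype."""
    return int(a1 == counted) + int(a2 == counted)


def pearson_r(x, y):
    """Plain Pearson correlation between two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    var_x = sum((xi - mean_x) ** 2 for xi in x)
    var_y = sum((yi - mean_y) ** 2 for yi in y)
    return cov / sqrt(var_x * var_y)


# Columns: snp11, snp12, snp21, snp22 (the four rows from the table).
genotypes = [
    ("1", "2", "2", "2"),
    ("2", "2", "2", "2"),
    ("1", "1", "1", "2"),
    ("1", "2", "2", "2"),
]

counts_snp1 = [allele_count(g[0], g[1], "1") for g in genotypes]
counts_snp2 = [allele_count(g[2], g[3], "1") for g in genotypes]

r = pearson_r(counts_snp1, counts_snp2)
r2 = r * r
print(counts_snp1, counts_snp2, r2)
```

This sketch ignores missing genotypes, which a real calculation would have to skip pairwise.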
