How does plink calculate linkage disequilibrium (r2) in unphased data?
2.4 years ago
serpalma.v ▴ 70

Hello

When I pass a VCF file to plink, it can calculate linkage disequilibrium (r2) even though the data are not phased. How is this possible when the haplotype frequencies for SNP pairs are not known?

Thanks!

plink SNP sequencing • 1.1k views
2.4 years ago

For each SNP, you can write a {0, 1, 2}-valued vector of REF allele counts, with one entry per sample. The unphased r2 reported by plink for a pair of SNPs is the square of the sample Pearson correlation between their two vectors.
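
To make the calculation described above concrete, here is a minimal Python sketch of a squared sample Pearson correlation between two genotype-count vectors. The function name `unphased_r2` and the toy genotype vectors are made up for illustration, and the sketch assumes there are no missing genotypes; it is not plink's actual code.

```python
def unphased_r2(g1, g2):
    """Squared sample Pearson correlation between two allele-count vectors.

    g1 and g2 hold one value in {0, 1, 2} per sample, in the same sample order.
    Assumption for this sketch: no missing genotypes.
    """
    n = len(g1)
    mean1 = sum(g1) / n
    mean2 = sum(g2) / n
    # Sums of centered products; the 1/(n-1) factors cancel in the ratio.
    cov = sum((a - mean1) * (b - mean2) for a, b in zip(g1, g2))
    var1 = sum((a - mean1) ** 2 for a in g1)
    var2 = sum((b - mean2) ** 2 for b in g2)
    if var1 == 0 or var2 == 0:
        return float("nan")  # correlation is undefined for a monomorphic SNP
    return cov * cov / (var1 * var2)


# Hypothetical toy data: allele counts at two SNPs for eight samples.
snp_a = [0, 1, 2, 1, 0, 2, 1, 0]
snp_b = [0, 1, 2, 2, 0, 1, 1, 0]
print(unphased_r2(snp_a, snp_b))
```

Because the correlation is squared, counting ALT alleles instead of REF alleles at either SNP (replacing each count x with 2 - x) gives the same value. No haplotype phase is needed, since only the per-sample counts enter the formula.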

Would the result have the same interpretation as r^2=D^2/(p1p2q1q2)?