Linkage Disequilibrium to Rare SNPs
8.0 years ago
Shicheng Guo ★ 9.4k

Hi colleagues,

Just a very simple question. When we calculate LD, we need P(A), P(a), P(B), P(b), and then

D = P(AB) - P(A)P(B)
r = (P(AB) - P(A)P(B)) / sqrt(P(A)P(a)P(B)P(b))

However, for rare SNPs or mutations, P(A) or P(B) might be 0. In that situation, r cannot be calculated. Is there any workaround for calculating r or D' in such a case?

Put another way, suppose I observe only one haplotype for two loci across all samples (a very large sample). Can I treat this as complete linkage, or should I record LD as 'NA'?

Thanks

LD Linkage Disequilibrium • 1.7k views
8.0 years ago
Fabio Marroni ★ 3.0k

Technically, if the frequency of one of the alleles is zero, that locus is not polymorphic, and it makes no sense to compute LD for those two loci. So NA is appropriate. Do not record 1, since that would be a mistake.

LD expresses the difference between the observed haplotype frequencies and those expected from the allele frequencies. When one allele is missing at a locus (i.e. the locus is not polymorphic), the haplotype frequencies are by definition exactly those expected from the allele frequencies: D is 0, and the denominator of r is 0 as well. You therefore have no information to test for non-random association of alleles at the two loci, and again NA is the appropriate answer.
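To make the point concrete, here is a minimal Python sketch of the formulas from the question, with the monomorphic case reported as NaN (Python's stand-in for NA). The function name `ld_stats` and its arguments are hypothetical, chosen only for illustration. The D' normalization is an assumption, since the thread never defines it: this is the usual Lewontin D' with the sign of D kept.

```python
import math


def ld_stats(p_ab, p_a, p_b):
    """Return (D, r, D') for two biallelic loci.

    p_ab: frequency of the AB haplotype
    p_a, p_b: frequencies of allele A (locus 1) and allele B (locus 2)
    r and D' are NaN when either locus is monomorphic.
    """
    q_a, q_b = 1.0 - p_a, 1.0 - p_b      # P(a), P(b)
    d = p_ab - p_a * p_b                 # D = P(AB) - P(A)P(B)
    denom = p_a * q_a * p_b * q_b        # P(A)P(a)P(B)P(b)

    if denom <= 0.0:
        # One locus has a single allele: r is 0/0, so report NA, not 1.
        return d, math.nan, math.nan

    r = d / math.sqrt(denom)

    # Assumed normalization (Lewontin's D'), keeping the sign of D.
    if d >= 0:
        d_max = min(p_a * q_b, q_a * p_b)
    else:
        d_max = min(p_a * p_b, q_a * q_b)
    d_prime = d / d_max if d_max > 0 else math.nan

    return d, r, d_prime


print(ld_stats(0.5, 0.5, 0.5))   # two polymorphic loci, only AB and ab present
print(ld_stats(0.3, 1.0, 0.3))   # locus 1 monomorphic: P(A) = 1
```

The first call prints `(0.25, 1.0, 1.0)`, i.e. complete LD between two polymorphic loci. The second prints `(0.0, nan, nan)`: D is exactly 0 because P(AB) = P(B) when everyone carries A, and r is undefined, so it is not evidence of complete linkage.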
