Question: Linkage Disequilibrium to Rare SNPs
Shicheng Guo wrote (4.1 years ago):

Hi colleagues,

Just a very simple question. To calculate LD we need P(A), P(a), P(B), P(b), and then

D = P(AB) - P(A)P(B)
r = (P(AB) - P(A)P(B)) / sqrt(P(A)P(a)P(B)P(b))

However, for rare SNPs or mutations, P(A) or P(B) might be 0. In that situation r cannot be calculated. Is there any workaround for calculating r or D' in this case?

Put another way, suppose I observe only one haplotype for two loci across all samples (a very large sample). Can I treat this as complete linkage, or should I record LD as 'NA'?

Thanks

ld linkage disequilibrium • 1.1k views
Fabio Marroni (Italy) wrote (4.1 years ago):

Technically, if the frequency of one of the alleles is zero, one of your loci is not polymorphic, and it makes no sense to compute LD for those two loci. So NA is appropriate. Do not record a value of 1, since that would be a mistake.

LD expresses the difference between the observed haplotype frequencies and those expected from the allele frequencies. When one allele is missing at a locus (i.e. the locus is not polymorphic), the haplotype frequencies are by definition exactly those expected from the allele frequencies: D = P(AB) - P(A)P(B) = 0, and the denominator of r is also 0, so r is 0/0. You have no information to test whether there is non-random association of alleles at the two loci, so again NA is the appropriate answer.
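As a rough illustration of the point above, here is a minimal Python sketch that computes D, |D'| and r from phased haplotypes and returns None (i.e. NA) for |D'| and r when either locus is monomorphic. The function name `ld_stats` and the encoding of haplotypes as two-character strings ("AB", "Ab", "aB", "ab") are assumptions made for this example and are not taken from any particular LD package.

```python
import math
from collections import Counter


def ld_stats(haplotypes):
    """Return (D, abs_D_prime, r) for two biallelic loci.

    `haplotypes` is an iterable of two-character strings: "AB", "Ab",
    "aB" or "ab" (a hypothetical encoding chosen for this sketch).
    abs_D_prime and r are None when either locus is monomorphic,
    because the statistics are then 0/0 and carry no information.
    """
    counts = Counter(haplotypes)
    n = sum(counts.values())
    if n == 0:
        raise ValueError("no haplotypes supplied")

    p_AB = counts["AB"] / n
    p_A = (counts["AB"] + counts["Ab"]) / n
    p_B = (counts["AB"] + counts["aB"]) / n

    # D = P(AB) - P(A)P(B)
    D = p_AB - p_A * p_B

    # P(A)P(a)P(B)P(b); zero as soon as one allele frequency is 0 or 1
    denom = p_A * (1 - p_A) * p_B * (1 - p_B)
    if denom == 0:
        return D, None, None  # monomorphic locus: report NA, not 1

    r = D / math.sqrt(denom)

    # Largest |D| attainable given the allele frequencies
    if D >= 0:
        d_max = min(p_A * (1 - p_B), (1 - p_A) * p_B)
    else:
        d_max = min(p_A * p_B, (1 - p_A) * (1 - p_B))

    return D, abs(D) / d_max, r
```

For a sample in which only the haplotype AB is ever observed, e.g. `ld_stats(["AB"] * 1000)`, this returns `(0.0, None, None)`: D is zero and r and D' are left undefined rather than being set to 1.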
