I have SNP data for 500 individuals (samples). For each SNP, I know the number of minor alleles measured in each individual (i.e., 0, 1, or 2). I need to calculate the linkage disequilibrium between each pair of SNPs. I've found this formula:
where A and a (or B and b) are the two possible alleles at one locus, P(xy) denotes the frequency of observing x and y together on the same haplotype, and P(x) denotes the frequency of allele x.
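For reference, the usual textbook definitions in this notation are sketched below. Which measure is meant here is an assumption, since several LD statistics (D, D', r^2) are written with these same symbols:

$$
D = P(AB) - P(A)\,P(B), \qquad r^2 = \frac{D^2}{P(A)\,P(a)\,P(B)\,P(b)}
$$

Here P(a) = 1 - P(A) and P(b) = 1 - P(B), so r^2 lies between 0 and 1.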
I defined the frequencies as:
where #(0) is the number of individuals with a value of 0 for the specific SNP. Is this the right way to calculate linkage disequilibrium from the data I have?
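To make the setup concrete, here is a minimal sketch of what can be computed directly from 0/1/2 counts. It assumes the data is a NumPy array with one row per individual and one column per SNP; the function names are hypothetical. Note that with unphased counts the haplotype frequency P(AB) cannot simply be counted, because an individual with a 1 at both SNPs is ambiguous. The sketch therefore uses the squared Pearson correlation between genotype counts, a commonly used stand-in for r^2 when phase is unknown, and not necessarily the formula above.

```python
import numpy as np

def minor_allele_freq(genotypes):
    """Minor-allele frequency from 0/1/2 counts.

    Each individual carries two alleles, so the denominator is 2 * N.
    """
    g = np.asarray(genotypes, dtype=float)
    return g.sum() / (2 * g.size)

def genotype_r2(g1, g2):
    """Squared Pearson correlation between two SNPs' 0/1/2 counts.

    Assumption: this genotype-level correlation is used in place of the
    haplotype-based r^2, because phase is not observed.
    Returns nan if either SNP has no variation across individuals.
    """
    g1 = np.asarray(g1, dtype=float)
    g2 = np.asarray(g2, dtype=float)
    r = np.corrcoef(g1, g2)[0, 1]
    return r ** 2

def pairwise_r2(G):
    """All-pairs genotype r^2 for a matrix G of shape (individuals, SNPs)."""
    G = np.asarray(G, dtype=float)
    # rowvar=False treats each column (SNP) as one variable.
    return np.corrcoef(G, rowvar=False) ** 2

# Toy data (made up for illustration): 6 individuals, 3 SNPs.
G = np.array([
    [0, 0, 2],
    [1, 1, 1],
    [2, 2, 0],
    [1, 0, 1],
    [0, 0, 2],
    [2, 1, 0],
])
freqs = [minor_allele_freq(G[:, j]) for j in range(G.shape[1])]
ld = pairwise_r2(G)  # ld[i, j] is the genotype r^2 between SNP i and SNP j
```

For 500 individuals the same calls apply unchanged; only the number of rows in `G` differs.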