Hello everyone,
I have the allele frequencies at many loci for 26 populations. Does anybody know how to calculate the Fst for each variation? I am trying to find variations that are characteristic of a certain population/ethnicity (by selecting those variations with Fst < 0.05).
So for example, if I have 5 populations (A, B, C, D, E) and the allele frequencies at one locus (see below), how can I find the Fst for each variation at that position?
Pop   A->G   A->T   A->AC
A     0      0.94   0.05
B     0.07   0.15   0.1
C     0.8    0.1    0.04
D     0.1    0.05   0.2
E     0.03   0.1    0.15
------------------------------
Fst   ??     ??     ??
I hope my understanding of Fst is correct. If not, please correct me.
Thanks
I'm not sure what you mean. What is your ultimate goal, considering my reply above? Just to clarify: you mentioned ethnicity, so I guess this is human data. What type of data is it: mtDNA, Y, or autosomal? P.S. If you need to clarify or ask something further, you should add a comment instead of posting a new answer with a question :)
I am trying to find loci with significant variations in the genomes. For example in the data I posted in the question, I would say there is significant variation at this locus because only population A has a high frequency for A->T variation and population C has a high frequency for A->G variation. I am trying to find an appropriate statistical test to somehow analyze my data (in the form given in the question above), and I thought Fst (or chi square test) would be good (maybe I am wrong). There are many R packages that calculate Fst but they take alignments as input (which I don't have). Maybe there is another statistical test I should use that you guys know of?
Thanks for your help and time :)
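For what it's worth, Fst can be computed directly from allele frequencies without alignments. Below is a minimal sketch in Python, not a definitive method. It assumes each variation is treated as biallelic (the variant versus everything else at that locus), that all populations are weighted equally (no sample sizes were given), and that Wright's definition Fst = Var(p) / (p_bar * (1 - p_bar)) is acceptable here. The function name `fst_from_frequencies` and the `freq_table` layout are made up for illustration. Note that Fst near 0 means little differentiation between populations, so population-specific variations would be expected to have high Fst rather than Fst < 0.05.

```python
import math


def fst_from_frequencies(freqs):
    """Wright's Fst for one variant, given its frequency in each population.

    Assumption: every population gets equal weight (no sample-size correction).
    Fst = Var(p) / (p_bar * (1 - p_bar)), using the population variance.
    """
    n = len(freqs)
    p_bar = sum(freqs) / n  # mean frequency across populations
    if p_bar <= 0.0 or p_bar >= 1.0:
        # Variant absent or fixed everywhere: Fst is undefined.
        return math.nan
    var_p = sum((p - p_bar) ** 2 for p in freqs) / n
    return var_p / (p_bar * (1.0 - p_bar))


# Frequencies from the example locus, in population order A, B, C, D, E.
freq_table = {
    "A->G": [0.0, 0.07, 0.8, 0.1, 0.03],
    "A->T": [0.94, 0.15, 0.1, 0.05, 0.1],
    "A->AC": [0.05, 0.1, 0.04, 0.2, 0.15],
}

fst_by_variant = {
    variant: fst_from_frequencies(freqs) for variant, freqs in freq_table.items()
}

for variant, fst in fst_by_variant.items():
    print(f"{variant}: Fst = {fst:.3f}")
```

With the example data, A->G and A->T come out strongly differentiated while A->AC does not, which matches the intuition described in the comment above. If sample sizes per population are available, a sample-size-aware estimator (for example Weir and Cockerham's) would likely be a better choice than this unweighted version.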