I am interested in computing a genome-wide FST measure. I am implementing the FST estimator of Reynolds et al. (1983). I found a paper in Genetics that gives a formula for a per-site as well as a per-region FST measure:
Here a stands for the between-population component of genetic variance and b for the within-population component. The formula is easy to apply to a region: you sum a, and separately a + b, over all the sites within the region and then take the ratio of the two sums.
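For reference, here are the two formulas as I understand them. This assumes the usual ratio form of the estimator, with s indexing the n sites in a region; it is my transcription, not a quotation from the paper:

$$
F_{ST}^{(s)} = \frac{a_s}{a_s + b_s}
\qquad \text{and} \qquad
F_{ST}^{\text{region}} = \frac{\sum_{s=1}^{n} a_s}{\sum_{s=1}^{n} \left(a_s + b_s\right)}
$$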
My question is: if I want a single genome-wide estimate, is it OK to apply the second (per-region) formula to all the sites in the genome?
Also, several programs, such as PLINK, report both a weighted and an unweighted estimate of FST. What is the difference between the two? I assume the weighted estimate corresponds to the second formula above, whereas the unweighted one is just the mean of the per-site values?
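To make the two candidates concrete, here is a minimal sketch of what I mean. It assumes that per-site values of a and b are already computed, that "weighted" means the ratio of sums, and that "unweighted" means the mean of per-site ratios. The function name `fst_estimates` and the toy numbers are hypothetical; this is not PLINK's code.

```python
import numpy as np

def fst_estimates(a, b):
    """Return per-site FST, the mean of ratios, and the ratio of sums.

    a, b: per-site between- and within-population components
    (assumed to be precomputed, e.g. with the Reynolds estimator).
    """
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    denom = a + b

    # Per-site FST is undefined where a + b == 0 (e.g. monomorphic sites),
    # so those sites are dropped from both summaries.
    keep = denom > 0
    a, denom = a[keep], denom[keep]

    per_site = a / denom
    unweighted = per_site.mean()      # mean of per-site ratios
    weighted = a.sum() / denom.sum()  # ratio of sums (per-region formula)
    return per_site, unweighted, weighted

# Toy values for three sites (made up for illustration).
a = [0.02, 0.00, 0.10]
b = [0.18, 0.05, 0.10]
per_site, unweighted, weighted = fst_estimates(a, b)
print(per_site, unweighted, weighted)
```

In this toy case the two summaries differ (about 0.20 versus about 0.27) because the ratio of sums gives more influence to sites with a larger a + b, while the mean of ratios treats every site equally.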