Question: Estimating FST per genome
GabrielMontenegro (United Kingdom) wrote, 2.8 years ago:

Hi,

I am interested in computing an FST measure for the whole genome. I am implementing the Reynolds FST formula (1983). I found this paper in Genetics, which gives a formula for a per-site as well as a per-region FST measure:

[Image: per-site and per-region FST formulas] https://s12.postimg.org/otjso4lct/fst.png

Here a stands for the genetic differentiation between populations and b for the genetic differentiation within populations. The formula is easy to apply to a region: you just sum these values over all the sites within the region.
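For reference, a sketch of how the two formulas are usually written for this kind of estimator, assuming the per-site value is the ratio a / (a + b) and the per-region value is the ratio of the summed components. The site index s and the site count n are notation introduced here for illustration, not taken from the paper:

```latex
\[
F_{ST}^{(s)} = \frac{a_s}{a_s + b_s},
\qquad
F_{ST}^{(\mathrm{region})} = \frac{\sum_{s=1}^{n} a_s}{\sum_{s=1}^{n} \left(a_s + b_s\right)}
\]
```

Under this reading, a site with a_s + b_s = 0 has an undefined per-site value but adds nothing to either sum of the per-region formula.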

My question is: if you want a per-genome estimate, is it OK to just apply this second formula to all the sites in your genome?

Also, several programs such as PLINK report both a weighted and an unweighted estimate of FST. What is the difference between the two? I assume the weighted estimate is similar to the second formula I am showing, whereas the unweighted one is just the mean over all sites?

Paper: http://www.genetics.org/content/genetics/early/2013/08/15/genetics.113.154740.full.pdf

Zev.Kronenberg (United States) wrote, 2.8 years ago:

For a genomic average I would just calculate Weir and Cockerham's FST (1984) at each site, then build a distribution across the genome. You can also just take the average of the per-site FST values.

I've implemented this method in VCFLIB. If you're interested in learning more about FST, I've tried to name all the variables to match the paper.

https://github.com/vcflib/vcflib/blob/master/src/wcFst.cpp


Thanks for the reply! I will check the method in VCFLIB. Since you implemented that FST estimation yourself, I was wondering what to do with sites that are fixed between two populations. With the Reynolds FST I was getting undefined values at those sites; I assume it would be sensible to treat them as zero. Would you agree?

written 2.8 years ago by GabrielMontenegro

You can only calculate FST for segregating sites.

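    // Skip the site if either population has af == -1
    // (this looks like the "no allele frequency available" sentinel).
    if(populationTarget->af == -1 || populationBackground->af == -1){
        delete populationTarget;
        delete populationBackground;
        continue;
    }
    // Skip the site if both populations have af == 1 (not segregating).
    if(populationTarget->af == 1 && populationBackground->af == 1){
        delete populationTarget;
        delete populationBackground;
        continue;
    }
    // Skip the site if both populations have af == 0 (not segregating).
    if(populationTarget->af == 0 && populationBackground->af == 0){
        delete populationTarget;
        delete populationBackground;
        continue;
    }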
written 2.8 years ago by Zev.Kronenberg
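The checks in the reply above can be collected into one self-contained predicate. This is only a sketch: the struct and function names are hypothetical, and treating -1 as a "no data" sentinel is an assumption based on the excerpt. Note that, read literally, these checks skip sites where both populations carry the same fixed allele, while a site with frequency 1 in one population and 0 in the other passes the filter.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical per-site input: allele frequency in each population,
// with -1 assumed to mean "no frequency available", as in the excerpt.
struct SiteFreqs {
    double afTarget;
    double afBackground;
};

// Returns true if the site passes the three checks from the excerpt.
bool isUsableSite(const SiteFreqs& s) {
    // Missing frequency in either population.
    if (s.afTarget == -1 || s.afBackground == -1) {
        return false;
    }
    // Same allele fixed in both populations: nothing segregates.
    if (s.afTarget == 1 && s.afBackground == 1) {
        return false;
    }
    if (s.afTarget == 0 && s.afBackground == 0) {
        return false;
    }
    return true;
}

// Counts how many sites would be kept for the per-site calculation.
std::size_t countUsableSites(const std::vector<SiteFreqs>& sites) {
    std::size_t kept = 0;
    for (const auto& s : sites) {
        if (isUsableSite(s)) {
            ++kept;
        }
    }
    return kept;
}
```

Keeping the filter as a pure function avoids repeating the cleanup code before every `continue`, at the cost of diverging from the layout of the original loop.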