Question: Strange distribution of Fst values in BayeScan
4.8 years ago by
United States
sackettl20 wrote:

I have used BayeScan (2.1) successfully several times on various subsets of a large SNP dataset that I am currently working on. Recently, I tried to use it to infer outliers between several population pairs using my whole dataset (57,000 SNPs), and got a peculiar result in all 9 population pairs. The plot of posterior density for Fst looks as expected, except that the mean Fst is ~0.4-0.6, depending on the population pair (rather than the ~0.05 I know it to be). If I plot a frequency distribution of the Fst values inferred by BayeScan, the distribution is the opposite of what I would expect -- the peak is at these high Fst values around 0.5, with a tail on the left toward low Fst values. These are bird populations that we know are differentiated but still experience some gene flow, and are geographically quite close. Calculating Fst in vcftools or Genepop gives a mean and distribution more in line with expectations -- mean ~0.05 with a tail to the right of higher Fst values.
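One way to put numbers on the shape described above is to summarise the per-locus Fst column directly. The sketch below assumes a whitespace-delimited table with a header row whose last column is named `fst`, and data rows that start with a locus index (which is how BayeScan's `_fst.txt` output is usually laid out). The function name `summarize_fst` and the example rows are made up for illustration, not taken from real output.

```python
import statistics


def summarize_fst(lines, column="fst"):
    """Return mean, median and a crude tail direction for a per-locus Fst column.

    Assumption: a header row of column names, followed by rows that may carry
    a leading locus index (so data rows can have one more field than the header).
    """
    rows = [ln.split() for ln in lines if ln.strip()]
    header, data = rows[0], rows[1:]
    # Shift the column position if the data rows have a leading index field.
    offset = len(data[0]) - len(header)
    col = header.index(column) + offset
    values = [float(r[col]) for r in data]
    mean = statistics.mean(values)
    median = statistics.median(values)
    # mean > median suggests a right tail; mean < median suggests a left tail.
    if mean > median:
        tail = "right"
    elif mean < median:
        tail = "left"
    else:
        tail = "none"
    return {"n": len(values), "mean": mean, "median": median, "tail": tail}


# Hypothetical example rows, not real BayeScan output.
example = """ prob log10(PO) qval alpha fst
1 0.05 -1.2 0.90 0.0 0.52
2 0.04 -1.3 0.91 0.0 0.55
3 0.06 -1.1 0.89 0.0 0.50
4 0.80 0.6 0.10 -1.5 0.12
""".splitlines()

summary = summarize_fst(example)
print(summary)
```

Running the same summary on the per-locus Fst values from vcftools or Genepop would give a like-for-like comparison of mean, median and tail direction between the two approaches.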

Has anyone else experienced something similar? I have been racking my brain to figure out what is causing this, but don't have any hypotheses yet. The current dataset has less missing data than the subsets I have previously used, so I wouldn't expect that to be the explanation. I used PGDSpider to convert from VCF format to BayeScan format.

I would appreciate any thoughts!

Tags: fst, snp, bayescan, outliers
written 4.8 years ago by sackettl20

I'm having the same problem. Did you figure out what's wrong? I've tried a couple of things, but I'm running out of ideas. At first I thought it was something related to the conversion, but looking at the files that worked, they are really similar.

written 4.4 years ago by henriquevf10

I'm having exactly the same issue. My BayeScan input files look the same, I use the same default settings (both using the GUI and command line versions), and 5/6 of my datasets result in Fst values of ~0.5 rather than ~0.005. I also can't see any differences in the .genepop files when I convert them to the .bayescan format in PGDSpider.
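Since comparing the converted files by eye has not turned up a difference, a scripted summary of each input file may be more sensitive. The sketch below assumes the SNP (codominant) input layout described in the BayeScan manual: `[loci]=N`, `[populations]=P`, then one `[pop]=i` block per population whose rows hold the locus index, the number of gene copies sampled, the number of alleles, and one count per allele. The function names `parse_bayescan` and `describe` and the example data are hypothetical.

```python
def parse_bayescan(text):
    """Parse a BayeScan-style allele-count file into {pop: {locus: [counts]}}.

    Assumed row layout inside each [pop]= block:
    locus index, gene copies sampled, number of alleles, then allele counts.
    """
    pops, current = {}, None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith("[pop]="):
            current = int(line.split("=")[1])
            pops[current] = {}
        elif line.startswith("["):
            continue  # skip the [loci]= and [populations]= header lines
        elif current is not None:
            fields = [int(x) for x in line.split()]
            locus, n_genes, n_alleles = fields[:3]
            counts = fields[3:3 + n_alleles]
            if sum(counts) != n_genes:
                raise ValueError(
                    f"locus {locus} in pop {current}: counts do not sum to {n_genes}"
                )
            pops[current][locus] = counts
    return pops


def describe(pops):
    """Per-population locus count, mean gene copies, and fraction of fixed loci."""
    out = {}
    for pop, loci in pops.items():
        sizes = [sum(c) for c in loci.values()]
        # A locus is "fixed" here if all sampled copies carry the same allele.
        fixed = sum(1 for c in loci.values() if sum(c) > 0 and max(c) == sum(c))
        out[pop] = {
            "loci": len(loci),
            "mean_genes": sum(sizes) / len(sizes),
            "fixed_fraction": fixed / len(loci),
        }
    return out


# Hypothetical two-population, three-locus file.
example = """[loci]=3

[populations]=2

[pop]=1
1 20 2 12 8
2 20 2 20 0
3 18 2 9 9

[pop]=2
1 16 2 10 6
2 16 2 16 0
3 16 2 4 12
"""

stats = describe(parse_bayescan(example))
print(stats)
```

Comparing these per-population summaries between a dataset that gives sensible Fst values and one that does not could show differences in sample size or in the share of monomorphic loci that are easy to miss when reading the files directly.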

Did either of you (or anyone else?) find a solution?

modified 2.3 years ago • written 2.3 years ago by aesculus10

Bump... Just seeing if anyone has any explanations...

written 2.3 years ago by aesculus10

I never found a solution to this. My only hypothesis is that the weird distribution may somehow be related to a Wahlund effect / substructure in the data. I broke my populations into a couple of subpopulations and analyzed those separately. That worked for me, but it's not the best or most satisfying solution.
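For readers unfamiliar with the Wahlund effect mentioned above: when subpopulations with different allele frequencies are pooled and treated as one population, the pooled sample shows fewer heterozygotes than Hardy-Weinberg proportions predict, and the shortfall equals twice the variance in allele frequency among subpopulations. The sketch below illustrates only that arithmetic for a biallelic locus; the function name and the example frequencies are hypothetical, and it does not show that this is what drives the BayeScan result.

```python
def wahlund_deficit(freqs, weights=None):
    """Expected vs. actual heterozygosity when HWE subpopulations are pooled.

    freqs: frequency of one allele in each subpopulation (biallelic locus).
    weights: relative subpopulation sizes (equal if omitted).
    """
    if weights is None:
        weights = [1.0] * len(freqs)
    total = sum(weights)
    w = [x / total for x in weights]
    p_bar = sum(wi * p for wi, p in zip(w, freqs))
    # Heterozygosity if the pooled sample were a single random-mating population.
    h_expected = 2 * p_bar * (1 - p_bar)
    # Actual heterozygosity: weighted mean of the within-subpopulation values.
    h_observed = sum(wi * 2 * p * (1 - p) for wi, p in zip(w, freqs))
    # Weighted variance in allele frequency among subpopulations.
    variance = sum(wi * (p - p_bar) ** 2 for wi, p in zip(w, freqs))
    return h_expected, h_observed, variance


# Two equally sized subpopulations with allele frequencies 0.2 and 0.8.
h_exp, h_obs, var_p = wahlund_deficit([0.2, 0.8])
print(h_exp, h_obs, h_exp - h_obs, 2 * var_p)
```

With these made-up frequencies the pooled expectation is 0.5, the actual heterozygosity is 0.32, and the deficit of 0.18 matches twice the variance (0.09).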

written 19 months ago by sackettl20