Question: Fst Uneven Populations
Lee Katz (Atlanta, GA) wrote 6.5 years ago:

Hi, I was wondering what to do with FST (the fixation index) when the two populations are of different sizes. How does that change the calculation of

(differencesBetween - differencesWithin) / differencesBetween

Ideally, a Perl subroutine for Fst that accounts for different population sizes would solve my problem. Thank you!


In my opinion, different sample sizes shouldn't be a big issue (unless one population is REALLY small). There are several estimators of Fst. You might start with a very basic approach, such as the one usually taught to undergrad students. For that, I always find this page by David McDonald at the University of Wyoming useful. A useful discussion for you is here on Biostars: Wright's Fst and Weir & Cockerham's Fst estimator - simple explanation of the difference

(comment written 6.5 years ago by Fabio Marroni)

I think this should be an answer, not a comment. It may not fully address the question, but it's still very informative. Giovanni's answer on the linked post is a great read on Fst.

(comment written 6.5 years ago by Jorge Amigo)

@Jorge. I pasted my comment as an answer. I read on GitHub that moderators currently cannot convert comments to answers, so I guessed I had to do it... Should I remove my comment now?

(comment written 6.5 years ago by Fabio Marroni)

To be honest, I don't know either! If a more experienced moderator finds it necessary, I'm sure he'll do that for you without losing your upvotes. I've already moved my upvote to the answer instead of leaving it on the comment, in case it helps in any way.

(comment written 6.5 years ago by Jorge Amigo)

Thanks, good answer!

(comment written 6.5 years ago by Lee Katz)

Fabio Marroni (Italy) wrote 6.5 years ago:

In my opinion, different sample sizes shouldn't be a big issue (unless one population is REALLY small). There are several estimators of Fst. You might start with a very basic approach, such as the one usually taught to undergrad students. For that, I always find this page by David McDonald at the University of Wyoming useful. A useful discussion for you is here on Biostars: Wright's Fst and Weir & Cockerham's Fst estimator - simple explanation of the difference


I am commenting on my own answer because I am currently working on Fst. I found several useful papers dealing with unbiased estimators for subpopulations of unequal size. One (http://bit.ly/1ivq9m4) also refers to the R package DEMEtics for calculating unbiased estimators. Another is the original work by Nei and Chesser, which includes adjustments for unbalanced sizes (http://bit.ly/1cWo4Z3). Finally, some researchers simply use a weighted average of Hs instead of the arithmetic mean. While the weighted average might work well (and I am using it), I cannot guarantee that it results in an unbiased estimate.
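
To make the difference between the two averaging schemes concrete, here they are written out. The notation is assumed for illustration and not taken from the cited papers: k subpopulations, with sample sizes n_i and expected heterozygosities H_i. Neither form includes the small-sample bias corrections of Nei and Chesser.

```latex
\[
\bar{H}_S^{\mathrm{arith}} = \frac{1}{k}\sum_{i=1}^{k} H_i ,
\qquad
\bar{H}_S^{\mathrm{weighted}} = \frac{\sum_{i=1}^{k} n_i H_i}{\sum_{i=1}^{k} n_i} ,
\qquad
F_{ST} = \frac{H_T - \bar{H}_S}{H_T}
\]
```

The two averages coincide when all n_i are equal, so the choice only matters for unbalanced samples.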

(comment written 6.5 years ago by Fabio Marroni)

Jorge Amigo (Santiago de Compostela, Spain) wrote 6.5 years ago:

We developed a tool years ago to calculate Fst, using the ideas given by this example and migrating those formulas into Perl. This is the Perl code we use to calculate Fst depending on different population and population group sizes:

# GENETIC DIFFERENTIATION (FOR POPULATION GROUPS)
# HT  = Hexp of the whole group
# HS  = sum( popHexp x popN ) / groupN
# Fst = ( HT - HS ) / HT
$HT = $Hexp{$group};
$HS = 0;
$Fst = 0;
# Fst is left at 0 when HT is 0 (no variation in the group)
if ($HT != 0) {
    foreach $pop (@pops) { $HS += $Hexp{$pop} * $N{$pop}; }
    $HS /= $N{$group};
    $Fst = ( $HT - $HS ) / $HT;
}

where Hexp is the local expected heterozygosity of each subpopulation, HS is the size-weighted mean of those values (the sum of each Hexp multiplied by its population size, divided by the group size), and HT is the expected heterozygosity for the entire group of populations considered. You have to calculate Hexp and HT beforehand.
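
For anyone who wants to try this without the precomputed hashes, here is a minimal self-contained sketch of the same weighted calculation. It rests on assumptions that are not part of the snippet above: a single biallelic locus, Hexp computed as 2p(1-p), and HT computed from the size-weighted pooled allele frequency. The names `hexp` and `fst_weighted` and the example populations are made up for illustration.

```perl
use strict;
use warnings;

# Expected heterozygosity at a biallelic locus, given the frequency p
# of one allele (assumption: Hexp = 2p(1-p)).
sub hexp {
    my ($p) = @_;
    return 2 * $p * (1 - $p);
}

# Fst = (HT - HS) / HT, with HS weighted by sample size.
# $pops is a hash ref: population name => { p => allele frequency, n => sample size }.
sub fst_weighted {
    my ($pops) = @_;
    my ($n_total, $p_sum, $hs_sum) = (0, 0, 0);
    for my $pop (values %$pops) {
        $n_total += $pop->{n};
        $p_sum   += $pop->{p} * $pop->{n};        # for the pooled allele frequency
        $hs_sum  += hexp($pop->{p}) * $pop->{n};  # weighted sum of local Hexp
    }
    return 0 if $n_total == 0;
    my $ht = hexp($p_sum / $n_total);  # heterozygosity of the pooled group
    return 0 if $ht == 0;              # no variation: return 0, as in the snippet above
    my $hs = $hs_sum / $n_total;       # size-weighted mean of local Hexp
    return ($ht - $hs) / $ht;
}

# Hypothetical example: two populations of very different size.
my %pops = (
    popA => { p => 0.9, n => 90 },
    popB => { p => 0.1, n => 10 },
);
printf "Fst = %.4f\n", fst_weighted(\%pops);
```

With these made-up numbers the large population dominates both HS and the pooled frequency used for HT, which is exactly the effect of weighting by size. Note that this is the plain weighted estimate, not a bias-corrected estimator.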
