Hello Everyone,

I have a VCF with multiple individuals from multiple populations and I would like to get a summary of the allele frequency spectrum for each population. I know that VCFtools has some nice options for outputting allele frequencies. However my data is from non-model organisms and I think using the reference/derived alleles for calculating allele frequencies is resulting in some serious biases.

So I am looking for two possible solutions:

1) How to calculate the folded allele frequency spectrum (not biased by ancestral/reference allele assumptions) from a VCF file as starting point.

2) How to calculate the allele frequency spectrum using an outgroup species to infer the ancestral allele. Are there any packages out there for this? It does not seem trivial to me to infer which allele is presumed ancestral and to incorporate this into the VCF file to then calculate allele frequency spectrum based on ancestral/derived alleles.

Any comments or ideas are much appreciated. Thank you in advance!

Best

Rubal

Thanks this looks great, is there a way to specify which individuals in the vcf file I want to include in the calculation?

Cheers