Question: Why is the Freebayes allele frequency (AF) always 0.5 or 1.0 instead of the actual allele frequency?
gsr9999 (United States) wrote 2.0 years ago:

Hi Biostars Leaders,

Freebayes (version v1.0.1-1-g683b3cc-dirty) defines AF as Description="Estimated allele frequency in the range (0,1]", but the values are always either 0.5 or 1.0; they are not the actual observed frequencies.

I have observed the same with GATK's HaplotypeCaller, where I calculated the actual frequencies myself from the reference and alternate allele counts in the AD field.

Freebayes does not output the AD field, but it has the following fields, which I think I can use instead: RO = "Reference allele observation count, with partial observations recorded fractionally" and AO = "Alternate allele observations, with partial observations recorded fractionally".

Is there any advice on how to calculate the actual allele frequencies from Freebayes output?

thanks, gsr
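
Editor's sketch for the RO/AO approach asked about above. AF in the INFO field appears to be estimated from the called genotypes, which would explain why a single diploid sample only ever shows 0.5 (heterozygous) or 1.0 (homozygous alternate). The observed fraction can instead be approximated as AO / (RO + sum of AO). The function names and the example INFO string below are hypothetical and for illustration only. RO and AO in INFO are totals across all samples, and RO + AO can be smaller than DP.

```python
def parse_info(info):
    """Parse a VCF INFO column into a dict; flags without '=' map to True."""
    fields = {}
    for item in info.split(";"):
        if not item:
            continue
        key, sep, value = item.partition("=")
        fields[key] = value if sep else True
    return fields


def observed_alt_fractions(info):
    """Return AO / (RO + sum(AO)) for each ALT allele listed in AO."""
    fields = parse_info(info)
    ro = float(fields["RO"])
    # AO is comma-separated when the site has more than one ALT allele.
    ao = [float(x) for x in fields["AO"].split(",")]
    total = ro + sum(ao)
    if total == 0:
        return [0.0 for _ in ao]
    return [a / total for a in ao]


# Hypothetical INFO value, not taken from a real call set.
example_info = "AO=12;RO=36;AF=0.5;DP=48"
print(observed_alt_fractions(example_info))  # [0.25]
```

Here the reported AF is 0.5, while the reads support the alternate allele at 0.25.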


Some variant callers assume the reference is the human genome, with a ploidy of 2, and their heuristics act accordingly, so they give incorrect results for genomes that don't have a ploidy of 2. I have not used Freebayes and I don't know whether it's possible to force it to calculate allele frequencies correctly, but BBMap's CallVariants tool (particularly in conjunction with BBMap for mapping) will correctly calculate and report the variant frequencies for SNPs. Indels are more complicated: the indel calls themselves are correct, but the reported allele frequency is an approximation whose accuracy gradually decays as the indel gets longer relative to read length, and it decays differently for insertions and deletions. For example, 100 bp reads would be highly accurate for 30 bp insertions and 100 kbp deletions, but not for 100 kbp insertions.

written 2.0 years ago by Brian Bushnell

Brian, thanks for the reply, but I think my question is still unanswered.

written 2.0 years ago by gsr9999
gsr9999 (United States) wrote 2.0 years ago:

I found the answer to my own question.

I installed a newer version of freebayes (v1.1.0-3-g961e5f3-dirty), which does output the AD field ("Number of observation for each allele").

I then used the AD field to calculate the actual allele frequencies in a human sample (NA12878).
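
Editor's sketch of the AD-based calculation described above. The post does not show the exact method used, so this is only one plausible way to do it: divide each allele's depth in the per-sample AD field by the sum of all AD values. The function name and the example FORMAT and sample strings are hypothetical.

```python
def allele_fractions_from_ad(format_col, sample_col):
    """Return the fraction of observations per allele (REF first, then ALTs).

    format_col is the VCF FORMAT column (e.g. "GT:DP:AD") and sample_col is
    the matching per-sample column.
    """
    keys = format_col.split(":")
    values = sample_col.split(":")
    sample = dict(zip(keys, values))
    ad = sample.get("AD", ".")
    # Treat missing values (".") as zero observations.
    depths = [0 if d == "." else int(d) for d in ad.split(",")]
    total = sum(depths)
    if total == 0:
        return [0.0] * len(depths)
    return [d / total for d in depths]


# Hypothetical heterozygous call with 30 REF and 10 ALT observations.
print(allele_fractions_from_ad("GT:DP:AD", "0/1:40:30,10"))  # [0.75, 0.25]
```

The first value is the reference fraction; the remaining values are the alternate allele fractions, in ALT column order.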


You can accept your own answer, by the way.

written 2.0 years ago by Brian Bushnell

Oh, I didn't know that. Thanks, Brian.

written 2.0 years ago by gsr9999