Why is the Freebayes allele frequency (AF) always 0.5 or 1.0 instead of the actual allele frequency?
7.1 years ago
gsr9999 ▴ 300

Hi Biostars Leaders,

Freebayes (version v1.0.1-1-g683b3cc-dirty) defines AF as Description="Estimated allele frequency in the range (0,1]", but the reported values are always either 0.5 or 1.0 rather than the actual observed frequencies.

I have observed the same behavior with GATK's HaplotypeCaller, and in that case I calculated the actual frequencies myself from the reference and alternate allele counts in the AD field.

Freebayes does not output the AD field, but it has the following fields, which I think I can use:

RO = "Reference allele observation count, with partial observations recorded fractionally"
AO = "Alternate allele observations, with partial observations recorded fractionally"

Is there any advice on how to calculate the actual allele frequencies from Freebayes output?
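
For concreteness, here is a minimal sketch of the calculation I have in mind: the observed fraction of each alternate allele as AO / (RO + sum of AO). It assumes RO and AO are present in the per-sample FORMAT column and that AO is comma-separated for multi-allelic sites. The function name and the example record are made up for illustration.

```python
def observed_alt_fractions(format_col, sample_col):
    """Observed fraction of each ALT allele, computed as AO / (RO + sum(AO)).

    format_col and sample_col are the FORMAT column and one sample column
    of a VCF record. Assumption: RO and AO are per-sample fields.
    """
    values = dict(zip(format_col.split(":"), sample_col.split(":")))
    ro_raw = values.get("RO", ".")
    ao_raw = values.get("AO", ".")
    if ro_raw == "." or "." in ao_raw.split(","):
        return []  # counts missing for this sample
    ro = float(ro_raw)
    ao = [float(x) for x in ao_raw.split(",")]  # one count per ALT allele
    total = ro + sum(ao)
    if total == 0:
        return []  # no observations, so the fraction is undefined
    return [a / total for a in ao]


# Hypothetical sample column: 30 reference and 10 alternate observations.
print(observed_alt_fractions("GT:DP:RO:QR:AO:QA:GL",
                             "0/1:40:30:1100:10:370:-20,0,-80"))
```

Note that the denominator here is RO plus all AO values rather than DP, so reads that support neither allele are not counted.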

thanks, gsr

next-gen • 5.4k views

Some variant callers assume the reference is the human genome, with a ploidy of 2, and their heuristics act accordingly, so they give incorrect results for genomes that don't have a ploidy of 2. I have not used Freebayes and I don't know whether it can be forced to calculate allele frequencies correctly, but BBMap's CallVariants tool (particularly in conjunction with BBMap for mapping) will correctly calculate and report the variant frequencies for SNPs. It's more complicated for indels: it will also report an approximation of the correct result for insertions, with accuracy gradually decaying the longer they get relative to read length. In other words, the indel calls themselves are correct, but the allele frequency becomes less accurate with length, and differently for insertions and deletions. For example, 100bp reads would be highly accurate for 30bp insertions and 100kbp deletions, but not for 100kbp insertions.


Brian, thanks for the reply, but I think my question is still unanswered.

7.1 years ago
gsr9999 ▴ 300

I found the answer to my own question.

I just installed a newer version of freebayes (v1.1.0-3-g961e5f3-dirty), which does output the AD field ("Number of observation for each allele").

I then used the AD field to calculate the actual allele frequencies in a human sample (NA12878).
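
The calculation from AD can be sketched roughly as below. This is only an illustration, assuming AD is a per-sample FORMAT field listing the reference count first and then one count per alternate allele. The function name and the example record are hypothetical, not taken from my actual run.

```python
def allele_fractions_from_ad(format_col, sample_col):
    """Fraction of observations per allele (REF first, then each ALT) from AD.

    format_col and sample_col are the FORMAT column and one sample column
    of a VCF record. Returns None when AD is absent, missing, or all zero.
    """
    values = dict(zip(format_col.split(":"), sample_col.split(":")))
    ad_raw = values.get("AD")
    if ad_raw is None or "." in ad_raw.split(","):
        return None  # AD not reported for this sample
    depths = [float(x) for x in ad_raw.split(",")]  # REF, ALT1, ALT2, ...
    total = sum(depths)
    if total == 0:
        return None  # no observations, so the fractions are undefined
    return [d / total for d in depths]


# Hypothetical heterozygous call with 30 REF and 10 ALT observations.
print(allele_fractions_from_ad("GT:DP:AD:RO:QR:AO:QA:GL",
                               "0/1:40:30,10:30:1100:10:370:-20,0,-80"))
```

The first element of the result is the reference fraction, and the remaining elements are the observed frequencies of the alternate alleles.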


You can accept your own answer, by the way.


Oh, I didn't know that. Thanks, Brian.
