variant analysis using Freebayes
21 months ago
bitpir ▴ 240

Hi all,

I ran freebayes using the following command line:

freebayes -b <bam> -f hg38.noalt.fa -v <vcf> --min-alternate-count 3 --min-alternate-fraction 0.2 -t b.bed

One of the samples has the following output:

chr1    179109038   .   A   C   0.014237    .   AB=0.235294;ABP=13.3567;AC=1;AF=0.5;AN=2;AO=4;CIGAR=1X;DP=17;DPB=17;DPRA=0;EPP=11.6962;EPPR=3.17734;GTI=0;LEN=1;MEANALT=1;MQM=60;MQMR=60;NS=1;NUMALT=1;ODDS=5.71882;PAIRED=1;PAIREDR=0.846154;PAO=0;PQA=0;PQR=0;PRO=0;QA=108;QR=460;RO=13;RPL=4;RPP=11.6962;RPPR=3.17734;RPR=0;RUN=1;SAF=4;SAP=11.6962;SAR=0;SRF=0;SRP=31.2394;SRR=13;TYPE=snp;technology.ILLUMINA=1    GT:DP:AD:RO:QR:AO:QA:GL 0/1:17:13,4:13:460:4:108:-4.86964,0,-36.5987

And the other one has this:

chr1    179109038   .   A   C   51.8359 .   AB=0.448276;ABP=3.68421;AC=1;AF=0.5;AN=2;AO=13;CIGAR=1X;DP=29;DPB=29;DPRA=0;EPP=31.2394;EPPR=3.55317;GTI=0;LEN=1;MEANALT=1;MQM=51.4615;MQMR=60;NS=1;NUMALT=1;ODDS=11.9357;PAIRED=1;PAIREDR=0.9375;PAO=0;PQA=0;PQR=0;PRO=0;QA=343;QR=601;RO=16;RPL=13;RPP=31.2394;RPPR=3.55317;RPR=0;RUN=1;SAF=13;SAP=31.2394;SAR=0;SRF=2;SRP=22.5536;SRR=14;TYPE=snp;technology.ILLUMINA=1  GT:DP:AD:RO:QR:AO:QA:GL 0/1:29:16,13:16:601:13:343:-19.822,0,-45.6764

This variant is a false positive when compared against the NA12878 benchmark.vcf. I don't understand why one sample has a low QUAL and the other a high QUAL. The BQ and MAPQ for these two samples are very similar: 35 and 57, respectively. Per the freebayes best practices, I should run vcffilter on QUAL, but I would like to understand what contributes to the QUAL score in this situation.
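For reference, the VCF specification defines QUAL as a Phred-scaled value, QUAL = -10 * log10(P), where P is the probability that the ALT call is wrong. A minimal sketch of that conversion for the two QUAL values above is given here; the function name `qual_to_error_prob` is only illustrative, and how freebayes arrives at P internally is not shown. The two records also differ in alternate support (AO=4, QA=108 versus AO=13, QA=343), which is presumably part of the difference.

```python
def qual_to_error_prob(qual: float) -> float:
    """Invert the Phred scaling used for the VCF QUAL column.

    QUAL = -10 * log10(P), so P = 10 ** (-QUAL / 10), where P is the
    probability that the ALT assertion is wrong.
    """
    return 10 ** (-qual / 10)


# QUAL values copied from the two records in this post.
for label, qual in [("sample 1", 0.014237), ("sample 2", 51.8359)]:
    p_wrong = qual_to_error_prob(qual)
    print(f"{label}: QUAL={qual} -> P(call is wrong) = {p_wrong:.3g}")
```

Read this way, the first call is almost certainly wrong according to the caller itself, while the second is reported with an error probability below 1 in 100,000.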

Any idea would be greatly appreciated! Thanks!

calling freebayes false variant positives
False positives are a known issue with variant-calling tools; the benchmarking papers cover this. Still, I would also look at the AD and DP fields in the VCF file. Maybe you should filter your VCF on these metrics (DP, MAPQ) instead of QUAL.
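A rough sketch of that kind of filter in Python, using only the INFO keys that appear in the records above (DP, AO, MQM, SAF, SAR). The thresholds and the function names `parse_info` and `passes` are hypothetical, the records are shortened copies of the two calls in the question, and the code assumes a single ALT allele (NUMALT=1). The strand check is an optional extra, not part of the suggestion above; it is included because both records show SAR=0, meaning every alternate read is on the forward strand.

```python
def parse_info(info: str) -> dict:
    """Split a VCF INFO string like 'DP=17;AO=4' into a dict of strings."""
    out = {}
    for item in info.split(";"):
        key, _, value = item.partition("=")
        out[key] = value
    return out


def passes(line: str, min_dp: int = 10, min_ao: int = 5,
           min_mqm: float = 30.0, both_strands: bool = False) -> bool:
    """Return True if a single-ALT VCF record meets the (hypothetical) thresholds."""
    fields = line.split()           # tolerates tabs or spaces between columns
    info = parse_info(fields[7])    # INFO is the 8th VCF column
    if int(info["DP"]) < min_dp:    # total read depth
        return False
    if int(info["AO"]) < min_ao:    # reads supporting the alternate allele
        return False
    if float(info["MQM"]) < min_mqm:  # mean mapping quality of alternate reads
        return False
    if both_strands and (int(info["SAF"]) == 0 or int(info["SAR"]) == 0):
        return False                # alternate reads seen on one strand only
    return True


# Shortened copies of the two records from the question (values unchanged).
REC1 = "chr1 179109038 . A C 0.014237 . DP=17;AO=4;MQM=60;SAF=4;SAR=0"
REC2 = "chr1 179109038 . A C 51.8359 . DP=29;AO=13;MQM=51.4615;SAF=13;SAR=0"

for name, rec in [("sample 1", REC1), ("sample 2", REC2)]:
    print(name, "depth/AO/MQM:", passes(rec),
          "| plus strand check:", passes(rec, both_strands=True))
```

With these example thresholds the first record is dropped for low alternate support, and the second only fails once the strand check is switched on.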


