How does FreeBayes evaluate variants that do not meet the --min-alternate-count (-C) and --min-alternate-fraction (-F) criteria?
6.3 years ago
Joanne Lim ▴ 20

Hi all,

I am currently using FreeBayes (v0.9.14-18-g36789d8-dirty) to call SNPs from a merged BAM file of 16 diploid plant samples. In my case, I would like FreeBayes to consider only variants that are supported by at least 5 alternate allele observations in a single sample (--min-alternate-count 5) and by at least 20% of the reads from a single sample (--min-alternate-fraction 0.2). What does FreeBayes do with variants that do not meet the --min-alternate-count and --min-alternate-fraction criteria?


freebayes -b BWT2_16Samples.merged.bam -f chr00.fasta -v BWT2_16Samples.merged.vcf --ploidy 2 --min-alternate-count 5 --min-alternate-fraction 0.2 --no-population-priors --min-mapping-quality 0

VCF output

#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT s1.sorted s2.sorted s3.sorted s4.sorted s5.sorted s6.sorted s7.sorted s8.sorted s9.sorted s10.sorted s11.sorted s12.sorted s13.sorted s14.sorted s15.sorted s16.sorted
chr00   822     .       T       A       878.663 .   AB=0.857143;ABP=10.7656;AC=23;AF=0.884615;AN=26;AO=28;CIGAR=1X;DP=31;DPB=31;DPRA=2.5;EPP=14.1779;EPPR=3.0103;GTI=0;LEN=1;MEANALT=1.08333;MQM=2.89286;MQMR=3;NS=13;NUMALT=1;ODDS=0.312408;PAIRED=0.964286;PAIREDR=0.5;PAO=0;PQA=0;PQR=0;PRO=0;QA=1015;QR=69;RO=2;RPP=7.97367;RPPR=7.35324;RUN=1;SAF=22;SAP=22.8638;SAR=6;SRF=1;SRP=3.0103;SRR=1;TYPE=snp   GT:DP:RO:QR:AO:QA:GL    .       1/1:1:0:0:1:39:-3.9,-0.30103,0  1/1:3:0:0:2:75:-6.77333,-0.60206,0      1/1:2:0:0:2:71:-6.745,-0.60206,0        .       1/1:1:0:0:1:41:-4.1,-0.30103,0       .       0/1:7:1:38:6:201:-10,0,-2.53789 1/1:1:0:0:1:34:-3.4,-0.30103,0  1/1:1:0:0:1:38:-3.8,-0.30103,0  1/1:2:0:0:2:76:-7.22,-0.60206,0 0/0:1:1:31:0:0:0,-0.30103,-3.1       1/1:4:0:0:4:150:-10,-1.20412,0  1/1:3:0:0:3:107:-9.98667,-0.90309,0     1/1:2:0:0:2:69:-6.555,-0.60206,0        1/1:3:0:0:3:114:-10,-0.90309,0

From the VCF output shown above, it seems that the only sample that fulfils the --min-alternate-count criterion is s8.sorted, which has 6 alternate allele observations (>= 5). Other samples such as s2.sorted and s3.sorted have only 1 and 2 alternate allele observations respectively, yet they are still printed in the output. Shouldn't they be filtered out?

Thanks for your help in advance.






SNP freebayes • 4.0k views
6.2 years ago
SNPsaurus ▴ 50

The --min-alternate-count flag sets the threshold that decides whether an allele is evaluated at all in the population. As long as at least one sample meets the threshold, the allele is used at that site, and genotypes are then called for every sample, including those with fewer alternate observations (--min-alternate-total plays the same role for observations summed across all samples). It looks like you want to filter away individual genotype calls that don't meet the depth threshold of 5. You can do that after the VCF is made, with vcffilter -g "DP > 5" for example.
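If you would rather filter on alternate observations than on total depth, the same idea can be sketched in a few lines of Python. This is only an illustration, not how FreeBayes or vcffilter work internally. The function names (`site_passes`, `mask_genotypes`, and the helpers) are made up for this example. It assumes a tab-separated, biallelic record (NUMALT=1) with GT, DP and AO in the FORMAT column, and it approximates the per-sample alternate fraction as AO/DP.

```python
import re

def parse_call(fmt, sample):
    """Map FORMAT keys to values for one sample; None for a missing call ('.')."""
    if sample == ".":
        return None
    return dict(zip(fmt.split(":"), sample.split(":")))

def has_alt_support(call, min_count=5, min_fraction=0.2):
    """True if this sample alone has enough alternate observations.
    Assumes a biallelic site, so AO is a single integer."""
    if call is None:
        return False
    ao, dp = call.get("AO", "."), call.get("DP", ".")
    if not (ao.isdigit() and dp.isdigit()) or int(dp) == 0:
        return False
    return int(ao) >= min_count and int(ao) / int(dp) >= min_fraction

def carries_alt(call):
    """True if the genotype contains at least one non-reference allele."""
    return any(a not in ("0", ".") for a in re.split(r"[/|]", call["GT"]))

def site_passes(vcf_line, min_count=5, min_fraction=0.2):
    """Site-level view: one supporting sample is enough to keep the site."""
    fields = vcf_line.rstrip("\n").split("\t")
    return any(has_alt_support(parse_call(fields[8], s), min_count, min_fraction)
               for s in fields[9:])

def mask_genotypes(vcf_line, min_count=5, min_fraction=0.2):
    """Genotype-level view: replace alt-carrying calls that lack support with '.'.
    Homozygous-reference and already-missing calls are left untouched."""
    fields = vcf_line.rstrip("\n").split("\t")
    kept = fields[:9]
    for sample in fields[9:]:
        call = parse_call(fields[8], sample)
        weak = (call is not None and carries_alt(call)
                and not has_alt_support(call, min_count, min_fraction))
        kept.append("." if weak else sample)
    return "\t".join(kept)
```

Applied to the record in the question, `site_passes` would be true because of s8.sorted (AO=6 out of DP=7), while `mask_genotypes` would blank out calls such as s2.sorted and s3.sorted. Leaving 0/0 calls alone is a design choice for this sketch; change it if you also want a minimum depth for reference calls.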

