[Disclaimer: I cross-posted on the FreeBayes Google group but it doesn't seem to get much activity]
I'm trying to call low-frequency variants in a pooled sample (C. elegans WGS, 20X coverage, aligned with BBMap using the 'sam=1.3' flag) and have run into an unexpected problem, which usually means I'm doing something wrong :-). Here's the command:
freebayes -f REFERENCE.FA -F 0.01 -C 1 DATA_SORTED.BAM > DATA.VCF
The VCF contains a number of long (e.g., 10 kb) complex deletions, each supported by only a few (2-3) reads with low mapping quality (2-18). These deletions overlap SNPs that are supported by more (8-12) higher-quality (>20) reads. The problem is that the complex deletion is called but the SNPs are not, even though the data support the SNP calls more strongly. One example using SAMtools 'tview' is shown below (reads supporting the complex deletion in green/blue; asterisk=deletion).
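In case it helps anyone reproduce this, here is a minimal sketch of how the long deletions could be pulled out of the VCF for inspection. The helper name `list_long_deletions` and the 1000-base default threshold are my own illustrative choices, not part of freebayes. It assumes a plain, uncompressed VCF with explicit REF/ALT sequences (no symbolic alleles such as `<DEL>`).

```shell
# Hypothetical helper (not a freebayes tool): print CHROM, POS, deleted length,
# and QUAL for every record whose REF allele is at least MIN_LEN bases longer
# than one of its ALT alleles.
list_long_deletions() {
    vcf="$1"                 # path to an uncompressed VCF
    min_len="${2:-1000}"     # minimum deleted length; 1000 is an arbitrary default
    awk -v min="$min_len" 'BEGIN { FS = OFS = "\t" }
        /^#/ { next }                        # skip header lines
        {
            n = split($5, alts, ",")         # ALT may list several alleles
            for (i = 1; i <= n; i++) {
                del = length($4) - length(alts[i])
                if (del >= min) { print $1, $2, del, $6; break }
            }
        }' "$vcf"
}

# Example: list_long_deletions DATA.VCF 1000
```

Each output line gives the start position and size of one deletion, which can then be compared against the positions of the annotated SNPs that went uncalled.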
I realize that I can filter on mapping quality, but I'm trying to maximize sensitivity to true-positive SNP calls (which I can validate against a prior annotation), and filtering reduces sensitivity. I've tried supplying the prior annotation VCF (-@ flag), but that didn't help, and the '--no-complex' flag just changed the TYPE to del. And obviously, filtering the VCF afterwards cannot recover variant calls that are missing.
Should I be using a different set of criteria/flags to call these overlapping SNPs?