I’m trying to call phased variants in 400-450 bp amplicons derived from pooled samples that may contain multiple haplotypes.
To do this, my plan is to sequence the amplicons in a 2x300 bp MiSeq run, merge reads 1 and 2, and then align the merged reads to a reference.
For haplotype calling, my initial plan was to use freebayes, which I have used quite successfully in the past to call (unphased) variants. According to its documentation and some posts I’ve read online, it can call haplotypes (ostensibly through the “-E” flag) and, if properly configured, does not assume that samples come from diploid sources (mine do not). Unfortunately, all my attempts to call haplotypes with freebayes on simulated data (where the haplotypes are known) have failed: the component variants are called, but not as haplotypes. I’ve written to the freebayes author and also posted my problem online (freebayes problem example), but have received no response.
So, if no one can offer a solution to my freebayes problem, can anyone suggest a tool well suited to my needs? I was considering Platypus; however, it assumes samples come from diploid organisms and so may not have the sensitivity I need. Also, I cannot use GATK.
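To illustrate the kind of output being asked for, here is a minimal Python sketch (not taken from freebayes or any other tool). Because each merged read should span the whole amplicon, the haplotype of a read is simply the combination of bases it carries at the variant sites. The sketch assumes gapless alignments, already-known variant positions, and a toy reference; the names count_haplotypes and make_read are hypothetical, and there is no handling of indels, base qualities, or sequencing error.

```python
from collections import Counter


def count_haplotypes(reads, sites):
    """Count the allele combinations observed across the given sites.

    reads: list of (start, sequence) tuples, where start is the 0-based
           reference position of the first base (assumes no indels).
    sites: sorted list of 0-based reference positions of variant sites.
    """
    counts = Counter()
    for start, seq in reads:
        end = start + len(seq)
        # Only reads covering every site can be phased across all of them.
        if start > sites[0] or end <= sites[-1]:
            continue
        haplotype = "".join(seq[pos - start] for pos in sites)
        counts[haplotype] += 1
    return counts


def make_read(reference, alleles):
    """Build a full-length toy read carrying the given {position: base} changes."""
    seq = list(reference)
    for pos, base in alleles.items():
        seq[pos] = base
    return (0, "".join(seq))


# Toy data: a 20 bp "amplicon" with three variant sites (all values made up).
reference = "ACGTACGTACGTACGTACGT"
sites = [3, 10, 16]

reads = (
    [make_read(reference, {3: "C", 16: "G"})] * 3  # haplotype with two linked variants
    + [make_read(reference, {10: "A"})] * 2        # haplotype with one variant
    + [make_read(reference, {})]                   # reference haplotype
    + [(5, reference[5:])]                         # partial read, skipped
)

haplotype_counts = count_haplotypes(reads, sites)
for haplotype, n in haplotype_counts.most_common():
    print(haplotype, n)
```

In a pooled sample, the relative counts would approximate haplotype frequencies, which is the information a diploid-only caller would not report.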
Thanks for your help