Question: SNP Detection In Viruses: Does A Haploid Caller Like FreeBayes Need To Be Used Instead Of Something Like Samtools/Bcftools?
I've done variant calling for many eukaryotic organisms, and have always followed a simple, highly vetted pipeline: map my reads to a reference, generate a pileup with samtools, and call variants with bcftools. Now I'm interested in SNP calling for viral genomes. Although I know of several published virology papers that use this pipeline, I've heard that a haploid-specific algorithm, like the one used in FreeBayes, may be a more accurate route.

Is this really the case? Does anyone know what goes wrong when bcftools is used on viral sequence data? Are there better options than FreeBayes?

Thanks for any advice.

Well, basically the genotyping model in samtools (and therefore bcftools) assumes a diploid genome by default. I believe that is also why SNVs below an allele frequency of roughly 20% tend to get dropped: they look like neither a heterozygous nor a homozygous site. For viral genomes it's better to use programs such as LoFreq, breseq, or SNVer. These are especially useful if you're interested in low-frequency variants.
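
To make the diploid issue concrete, here is a toy sketch in Python. It is not the actual samtools/bcftools or LoFreq model; the 1% per-base error rate, the three expected allele fractions, and the function names are all assumptions picked for illustration. It compares a simple diploid genotype likelihood against a plain "is this more than sequencing error?" test for a variant at 5% frequency.

```python
import math


def log_binom_pmf(k, n, p):
    """Log of the Binomial(n, p) probability of observing exactly k successes."""
    log_coeff = math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
    return log_coeff + k * math.log(p) + (n - k) * math.log1p(-p)


def diploid_best_genotype(alt, depth, err=0.01):
    """Pick the diploid genotype whose expected alt fraction best fits the counts.

    Assumption: alt reads appear at fraction err (0/0), 0.5 (0/1) or 1 - err (1/1).
    """
    fractions = {"0/0": err, "0/1": 0.5, "1/1": 1.0 - err}
    loglik = {g: log_binom_pmf(alt, depth, f) for g, f in fractions.items()}
    return max(loglik, key=loglik.get)


def error_tail_pvalue(alt, depth, err=0.01):
    """P(X >= alt) for X ~ Binomial(depth, err): chance that errors alone explain the alt reads."""
    return sum(math.exp(log_binom_pmf(k, depth, err)) for k in range(alt, depth + 1))


# Hypothetical site: 50 of 1000 reads carry the alternate base (5% frequency).
alt, depth = 50, 1000
print("diploid model picks:", diploid_best_genotype(alt, depth))
print("observed alt frequency:", alt / depth)
print("unlikely to be pure error:", error_tail_pvalue(alt, depth) < 1e-6)
```

In this toy setup the diploid model settles on 0/0 because 5% is much closer to the error rate than to 50%, so the site would be reported as reference, while the frequency-based test flags the same counts as very unlikely to be sequencing error alone. Dedicated low-frequency callers use more careful models (per-base qualities, strand bias filters and so on), so treat this only as intuition for why the choice of model matters.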


