My usual pipeline for SNP calling is to run MarkDuplicates and then GATK HaplotypeCaller. I read somewhere on the GATK forum that HaplotypeCaller ignores reads that are marked as duplicates, but I think there is something to be learned from the duplicate reads. They are essentially repeated sequencing of the same original fragment, so in theory you could use all the duplicate reads to build a consensus read sequence and then call SNPs more accurately.
I would expect some SNP callers to take advantage of this, but I can't find any with the keywords I have been searching for. Are there SNP callers that take the marked duplicate reads into account? Or are there tools that build consensus reads from duplicate reads, which you could then feed into any SNP caller?
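To make the consensus idea concrete, here is a minimal sketch of what I have in mind. It is only an illustration, not how any existing tool works: the function name `consensus_read` is hypothetical, and it assumes every duplicate in a group has the same length and the same alignment with no indels, with base qualities given as Phred integers. At each position it picks the base with the highest summed quality across the duplicates.

```python
from collections import defaultdict


def consensus_read(reads):
    """Build a consensus sequence from one group of duplicate reads.

    reads: list of (sequence, qualities) tuples, where qualities is a
    list of Phred scores. Assumes (for illustration only) that all reads
    have equal length and are aligned identically, with no indels.
    """
    if not reads:
        raise ValueError("need at least one read")
    length = len(reads[0][0])
    if any(len(seq) != length or len(quals) != length for seq, quals in reads):
        raise ValueError("all reads and quality lists must have the same length")

    consensus = []
    for i in range(length):
        # Sum the base qualities supporting each base at this position.
        support = defaultdict(int)
        for seq, quals in reads:
            support[seq[i]] += quals[i]
        # Pick the best-supported base; ties go to the alphabetically first base.
        consensus.append(max(sorted(support), key=support.get))
    return "".join(consensus)


# Three duplicates of the same fragment; the third has a low-quality mismatch.
duplicates = [
    ("ACGT", [30, 30, 30, 30]),
    ("ACGT", [30, 30, 30, 30]),
    ("ACTT", [30, 30, 10, 30]),
]
print(consensus_read(duplicates))
```

Here the low-quality `T` in the third read is outvoted by the two `G` calls, so the consensus reads `ACGT`. A real implementation would also need to handle indels, soft clipping, paired reads, and a consensus base quality.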
Many thanks in advance!