I aligned five mate-pair libraries (5x coverage each, different insert sizes) to a 300 Mb genome and started Pindel on all BAM files without the dispersed duplicates option. It has now been running for 10 days on 30 cores and has processed around half of the genome.
Is there a way to speed up Pindel, e.g. by de-duplicating the reads first? Is this the expected runtime?
I have read about processing the chromosomes individually. How do you deal with interchromosomal duplications in that case?
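For context, this is roughly what I understand the per-chromosome approach to look like. It is only a sketch: the file names (`genome.fa`, `bam_config.txt`, `pindel_out`) and the chromosome names are placeholders, and the flags (`-f`, `-i`, `-c`, `-o`, `-T`) are what I take from Pindel's usage text, so please check them against `pindel -h` for your version. The helper only builds the command lines and does not run anything.

```python
from typing import List


def pindel_commands(
    chromosomes: List[str],
    reference: str,
    config: str,
    out_prefix: str,
    threads: int = 4,
) -> List[List[str]]:
    """Build one Pindel command line per chromosome (nothing is executed)."""
    commands = []
    for chrom in chromosomes:
        commands.append([
            "pindel",
            "-f", reference,                  # reference FASTA (placeholder name)
            "-i", config,                     # BAM config file (placeholder name)
            "-c", chrom,                      # restrict this run to one chromosome
            "-o", f"{out_prefix}_{chrom}",    # separate output prefix per chromosome
            "-T", str(threads),               # threads for this single run
        ])
    return commands


if __name__ == "__main__":
    # Hypothetical chromosome names; print the commands instead of running them.
    for cmd in pindel_commands(["chr1", "chr2"], "genome.fa", "bam_config.txt", "pindel_out"):
        print(" ".join(cmd))
```

Each of these commands could then be submitted as its own job. Since every run is restricted to a single chromosome, I do not see how events spanning two chromosomes would be picked up, hence the question.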