When I use Picard MarkDuplicates to remove duplicates from my BWA output, I get this error: Exception in thread "main" net.sf.picard.PicardException: Value was put into PairInfoMap more than once. 1: CTTGTA_L006:HWI-ST1148:55:C1CKAACXX:6:1308:16351:60596 Does anyone know how to solve it?
Prefix the read IDs with the lane number before merging the files.
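A minimal sketch of that renaming step, assuming plain 4-line FASTQ records (no wrapped sequence lines). The function name `prefix_read_names` and the tag `L006` are made up for illustration; use the same tag for both mate files of a lane so the pairs still match.

```python
import io

def prefix_read_names(fastq_in, fastq_out, lane_tag):
    """Copy a FASTQ stream, prepending lane_tag to every read name."""
    for i, line in enumerate(fastq_in):
        # In a 4-line FASTQ record, the header is the first line of each group of four.
        if i % 4 == 0:
            if not line.startswith("@"):
                raise ValueError(f"line {i + 1} is not a FASTQ header: {line!r}")
            line = "@" + lane_tag + "_" + line[1:]
        fastq_out.write(line)

# Tiny in-memory demonstration with a made-up record.
demo_in = io.StringIO("@read1/1\nACGT\n+\nIIII\n")
demo_out = io.StringIO()
prefix_read_names(demo_in, demo_out, "L006")
print(demo_out.getvalue(), end="")
```

For real data you would pass open file handles (for example from `gzip.open(path, "rt")`) instead of the `StringIO` objects.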
MarkDuplicates will not consider two reads to be duplicates if they have different read group IDs, so assigning a different read group ID to each lane will address the immediate problem.
Fix the SM tag in your read groups before merging the BAM files.
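To illustrate the two read group suggestions above, here is a rough sketch that builds one `@RG` header line per lane: the `ID` differs per lane while `SM` stays the same for the sample. The helper `read_group_line`, the `sample.lane` ID scheme, and the names `sampleA`, `L005` and `L006` are assumptions for the example, not something from this thread.

```python
def read_group_line(sample, lane, platform="ILLUMINA"):
    """Build a SAM @RG header line with a lane-specific ID and a shared SM."""
    rg_id = f"{sample}.{lane}"  # hypothetical naming scheme: unique per lane
    fields = ["@RG", f"ID:{rg_id}", f"SM:{sample}", f"PU:{lane}", f"PL:{platform}"]
    return "\t".join(fields)

# Two lanes of the same (made-up) sample get different IDs but the same SM.
rg_lane5 = read_group_line("sampleA", "L005")
rg_lane6 = read_group_line("sampleA", "L006")
print(rg_lane5)
print(rg_lane6)
```

How the line gets into the BAM depends on your pipeline; the point is only that each lane carries its own `ID` and all lanes of a sample share one `SM`.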
This problem might have been solved two years ago with Raony's suggestion. It's a common problem when FASTQ files from more than one lane are merged: the read names may overlap, and Picard throws this exception if the same read name appears in the BAM twice.
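If you want to check whether two lanes really share read names before merging, something like the following sketch would do it. It assumes 4-line FASTQ records and holds all names in memory, so it is only suitable for modest files; `read_names` and `shared_names` are hypothetical helper names.

```python
import io

def read_names(fastq):
    """Collect read names from a 4-line FASTQ stream."""
    names = set()
    for i, line in enumerate(fastq):
        if i % 4 != 0:
            continue
        fields = line[1:].split()  # drop '@' and any comment after whitespace
        if not fields:
            continue  # tolerate a trailing blank line
        name = fields[0]
        if name.endswith(("/1", "/2")):
            name = name[:-2]  # strip the mate suffix
        names.add(name)
    return names

def shared_names(fastq_a, fastq_b):
    """Return the read names that occur in both FASTQ streams."""
    return read_names(fastq_a) & read_names(fastq_b)

# Made-up lanes: r2 occurs in both.
lane_a = io.StringIO("@r1\nAC\n+\nII\n@r2\nGT\n+\nII\n")
lane_b = io.StringIO("@r2\nTT\n+\nII\n@r3\nCC\n+\nII\n")
overlap = shared_names(lane_a, lane_b)
print(sorted(overlap))
```

A non-empty result means the merged file will contain colliding read names unless they are renamed or placed in separate read groups.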
I've had this PicardException arise for another reason. MiSeq samples with some low-quality bases, aligned with BWA-MEM, can be placed into the BAM more than once with equivalent quality scores. Picard MarkDuplicates then sees, for example, two forward reads and one reverse read with the same read name. BWA-MEM should report only the best-quality alignment, but (at least in version 0.7.3) it can report more than one alignment when soft clipping allows for equivalent quality scores.
Funny that MarkDuplicates can't handle technical duplicates, but the solution is to filter with samtools to keep only properly paired reads.
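For reference, "properly paired" corresponds to bit 0x2 of the SAM FLAG field, which is what `samtools view -f 2` selects on. The sketch below applies the same test to SAM text lines in plain Python; the function name `filter_proper_pairs` and the demo records are made up, and whether this filter removes the extra alignments in your data is something to verify on your own BAM.

```python
PROPER_PAIR = 0x2  # SAM FLAG bit: each segment properly aligned according to the aligner

def filter_proper_pairs(sam_lines):
    """Yield header lines and only those alignment lines flagged as properly paired."""
    for line in sam_lines:
        if line.startswith("@"):
            yield line  # keep header lines untouched
            continue
        flag = int(line.split("\t")[1])  # FLAG is the second SAM column
        if flag & PROPER_PAIR:
            yield line

# Made-up records: flags 99 and 147 are a proper pair, flag 65 is not.
demo_sam = [
    "@HD\tVN:1.6",
    "r1\t99\tchr1\t100\t60\t4M\t=\t200\t104\tACGT\tIIII",
    "r1\t147\tchr1\t200\t60\t4M\t=\t100\t-104\tACGT\tIIII",
    "r2\t65\tchr1\t300\t0\t4M\tchr2\t500\t0\tACGT\tIIII",
]
kept = list(filter_proper_pairs(demo_sam))
for record in kept:
    print(record)
```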