Before running 'samtools rmdup' to remove PCR duplicates from a SAM/BAM file, I ran 'samtools fixmate', as recommended elsewhere. In the original SAM/BAM file, the lines for one pair of paired-end reads (e.g. 6qyfyza) looked like this:
6qyfyza 67 1 182426 0 26M = 182628 203
6qyfyza 131 1 182628 0 22M = 182426 -203
but after running 'samtools fixmate' they changed to:
6qyfyza 65 1 182426 0 26M = 182628 202
6qyfyza 129 1 182628 0 22M = 182426 -202
Not only was the insert size (9th column) changed from 203 to 202, but the bitwise flag (2nd column) was also changed from 67 to 65 and from 131 to 129. This happened to every read pair flagged 67 and 131 in the SAM/BAM file.
Is this really the expected behaviour? According to this site, 65 and 129 mean "mapped uniquely but wrong insert size, and could possibly reside in different contigs", so why am I getting such a bad flag when I try to fix the insert size with samtools fixmate?
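To see exactly which bits differ between the old and new flags, they can be decoded by hand. Below is a minimal Python sketch, assuming the flag bit definitions from the SAM specification; `FLAG_BITS` and `decode_flag` are names made up for this illustration and are not part of samtools.

```python
# SAM flag bits as defined in the SAM specification.
# FLAG_BITS and decode_flag are illustrative names, not a samtools API.
FLAG_BITS = {
    0x1: "paired",
    0x2: "proper_pair",
    0x4: "unmapped",
    0x8: "mate_unmapped",
    0x10: "reverse",
    0x20: "mate_reverse",
    0x40: "read1",
    0x80: "read2",
    0x100: "secondary",
    0x200: "qc_fail",
    0x400: "duplicate",
    0x800: "supplementary",
}


def decode_flag(flag):
    """Return the names of all bits that are set in a SAM flag."""
    return [name for bit, name in FLAG_BITS.items() if flag & bit]


# Compare the flags before and after 'samtools fixmate'.
for before, after in [(67, 65), (131, 129)]:
    print(before, decode_flag(before))
    print(after, decode_flag(after))
    # Bits present before fixmate but absent afterwards.
    print("cleared:", decode_flag(before & ~after))
```

In both cases the only bit that differs is 0x2 (properly paired). Neither 0x10 nor 0x20 is set in any of the four flags, so both reads of the pair are recorded as mapping to the forward strand.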
Thank you for the clear answer. Because I had to work with old BAM files whose original FASTQ files are missing, I don't know why the reads were mapped in this orientation.
I usually use Picard MarkDuplicates, which failed only on the old BAM files with the error:
so I was planning to use samtools rmdup for deduplication. Now I will try samtools markdup. Thank you.