I'm processing whole-genome BAM files. Since I'm specifically interested in chromosome 11, I have split my files by chromosome and I'm working only on this one. However, when I tried to run MarkDuplicates on the chr11 BAM files, it gave the following error:
SAM validation error: WARNING: Record 23, Read name IL21_1665:3:25:467:1485, Paired read should be marked as first of pair or second of pair.
Running ValidateSamFile on the same files produced hundreds of warnings with the same message. The warning also occurred with other Picard tools, such as FixMateInformation. At first, I thought the problem was related to inter-chromosomal pairs, where the mate of a read is not present in my BAM file. Then I saw this answer on Picard's FAQ page:
"If your reads have been divided into separate BAMs by chromosome, inter-chromosomal pairs will not be identified, but MarkDuplicates will not fail due to inability to find the mate pair for a read."
Right now, I'm confused and I don't know how to solve this. Should I run MarkDuplicates on the whole-genome file?
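To make the warning concrete, here is a minimal sketch of how I understand it, assuming it is raised for records whose FLAG has the "paired" bit (0x1) set but neither the "first of pair" (0x40) nor the "second of pair" (0x80) bit. The bit values come from the SAM specification; the function names and the file name `chr11.bam` are only illustrative, and this is not Picard's actual code.

```python
import sys

# FLAG bits as defined in the SAM specification
PAIRED = 0x1
FIRST_OF_PAIR = 0x40
SECOND_OF_PAIR = 0x80


def has_missing_pair_end(flag: int) -> bool:
    """Return True if the read is flagged as paired but is marked
    as neither first of pair nor second of pair."""
    is_paired = bool(flag & PAIRED)
    has_end = bool(flag & (FIRST_OF_PAIR | SECOND_OF_PAIR))
    return is_paired and not has_end


def count_offenders(sam_lines) -> int:
    """Count alignment records in SAM text whose FLAG matches the pattern above."""
    count = 0
    for line in sam_lines:
        if line.startswith("@"):  # skip header lines
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 2:  # skip blank or malformed lines
            continue
        if has_missing_pair_end(int(fields[1])):  # FLAG is the second column
            count += 1
    return count


if __name__ == "__main__":
    # Reads SAM text from standard input and reports the number of affected records.
    print(count_offenders(sys.stdin))
```

Saved as a script (for example `check_flags.py`, a hypothetical name), it can be fed with `samtools view chr11.bam | python check_flags.py`. If the count is nonzero in the whole-genome file as well, the flags were probably already like this before splitting, which would suggest the split by chromosome is not the cause.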