Why does a read with the "read paired" flag not have a mate in the BAM file?
3.8 years ago
ionox0 ▴ 360

The following read has a flag value of 97, which indicates a paired read, but there is no other read in the BAM file with the same ID. This must be an error that occurred during some step of processing this BAM, correct?

K00217:116:HM7N7BBXX:4:1101:12266:24859:GAA+TAT	97	chr14	105258971	60	121M	=	105259071	221	CGTCGCTCATGGTGCCCGAGGCTCCCGCGACGCTCACGCGCTCCTCTCAGGCTGGCGCTCCCCGAGCCCAGCTGGCCTGGCCACAGCCTCTGGGAGAAGCAAAGGAAGCTGAATGTGAGGC	JJJJJJJJJJFJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJFJFJJJJJJJJJJJJJJJJJJJJJJJJJJJJ	NM:i:0	MD:Z:121	AS:i:121	XS:i:0	RG:Z:sample


This must be an error that occurred during some step of processing this BAM, correct?

Definitely. If you grep for the read name in the raw SAM file, you should find both reads. Flag 97 means:

• paired
• first in pair
• mate in reverse strand

Because the "mate unmapped" bit (0x8) is not set, this means that the mate is also mapped, and that it is on the reverse strand. Perhaps you filtered this file before looking for the mate? How did you filter it?
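To make the breakdown of 97 concrete, here is a minimal sketch that decodes a SAM flag into its component bits. The bit values come from the SAM specification; the function name `decode_flag` and the label strings are just illustrative choices, not part of any library.

```python
# Bit values as defined in the SAM specification; labels are informal.
SAM_FLAGS = {
    0x1: "paired",
    0x2: "proper pair",
    0x4: "read unmapped",
    0x8: "mate unmapped",
    0x10: "read reverse strand",
    0x20: "mate reverse strand",
    0x40: "first in pair",
    0x80: "second in pair",
    0x100: "secondary alignment",
    0x200: "fails QC",
    0x400: "duplicate",
    0x800: "supplementary alignment",
}


def decode_flag(flag):
    """Return the labels of all bits set in a SAM flag, lowest bit first."""
    return [label for bit, label in SAM_FLAGS.items() if flag & bit]


# 97 = 1 + 32 + 64
print(decode_flag(97))  # ['paired', 'mate reverse strand', 'first in pair']
```

Note that 0x8 ("mate unmapped") is absent from the result, which is why the mate is expected to be aligned somewhere.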


This must be an error that occurred during some step of processing this BAM, correct?

yes

3.6 years ago
d-cameron ★ 2.3k

The following read has a flag value of 97, which indicates a paired read, but there is no other read in the BAM file with the same ID. This must be an error that occurred during some step of processing this BAM, correct?

Unfortunately, there is nothing in the SAM specification that actually requires the mate read to exist. Your file is still perfectly valid. That said, there are many possible reasons your mate could be missing, including:

• The BAM file was filtered to include only specific regions of the genome
• The BAM file was filtered to remove specific regions of the genome
• A de-duplication algorithm that was not read-pair aware was run
• A downsampling algorithm that was not read-pair aware was run
• The mate failed QC and was removed
• The mate was aggressively base quality or adapter trimmed to nothing
• Some reads were filtered on the command line (e.g. by piping to grep)
• Some other kind of filtering was performed in your pipeline
• The mate actually is there, but you ran an algorithm that changed the alignment position so it's not in the position it's supposed to be according to your read (e.g. GATK indel realignment does this)
• The SAM flag was changed and your read was never paired
• The read name of either the read or the mate was changed

To work out why it's missing you're going to have to go back to the fastq files and work your way forward through each step in your pipeline.
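When stepping through the pipeline, it can help to count how many paired reads have lost their mate after each stage. The following is a rough sketch that scans SAM-formatted text lines and reports read names for which only one end is present. It assumes plain tab-delimited SAM text as input (for example the output of `samtools view`); the function name `find_orphans` and the read names in the demo are hypothetical. It keeps one entry per read name in memory, so it is best suited to small files or single regions.

```python
from collections import defaultdict


def find_orphans(sam_lines):
    """Return sorted names of paired reads whose mate is not among the records."""
    ends_seen = defaultdict(set)  # read name -> which ends were observed
    for line in sam_lines:
        if line.startswith("@") or not line.strip():
            continue  # skip header lines and blank lines
        fields = line.rstrip("\n").split("\t")
        name, flag = fields[0], int(fields[1])
        if not flag & 0x1:
            continue  # read is not flagged as paired
        if flag & 0x900:
            continue  # ignore secondary (0x100) and supplementary (0x800) records
        if flag & 0x40:
            ends_seen[name].add("first")
        if flag & 0x80:
            ends_seen[name].add("second")
    both = {"first", "second"}
    return sorted(name for name, ends in ends_seen.items() if ends != both)


# Tiny made-up example: readA has lost its mate, readB is intact.
demo = [
    "@HD\tVN:1.6\tSO:coordinate",
    "readA\t97\tchr14\t100\t60\t10M\t=\t200\t110\tACGTACGTAC\tJJJJJJJJJJ",
    "readB\t99\tchr14\t150\t60\t10M\t=\t250\t110\tACGTACGTAC\tJJJJJJJJJJ",
    "readB\t147\tchr14\t250\t60\t10M\t=\t150\t-110\tACGTACGTAC\tJJJJJJJJJJ",
]
print(find_orphans(demo))
```

Running the same check on the output of each pipeline step should show the first stage at which orphaned reads appear. It will not detect a mate whose read name was changed, since matching is done by name.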