Question: Why does a read with the "read paired" flag not have a mate in the BAM file?
ionox0 wrote:

The following read has a flag value of 97, which indicates a paired read, but there is no other read in the BAM with the same ID. This must be an error that occurred during some of the processing of this BAM, correct?

K00217:116:HM7N7BBXX:4:1101:12266:24859:GAA+TAT	97	chr14	105258971	60	121M	=	105259071	221	CGTCGCTCATGGTGCCCGAGGCTCCCGCGACGCTCACGCGCTCCTCTCAGGCTGGCGCTCCCCGAGCCCAGCTGGCCTGGCCACAGCCTCTGGGAGAAGCAAAGGAAGCTGAATGTGAGGC	JJJJJJJJJJFJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJFJFJJJJJJJJJJJJJJJJJJJJJJJJJJJJ	NM:i:0	MD:Z:121	AS:i:121	XS:i:0	RG:Z:sample
sequencing mate bam paired • 240 views
modified 3 months ago by d-cameron • written 5 months ago by ionox0

This must be an error that occurred during some of the processing of this BAM, correct?

Definitely. If you grep the read name from the raw SAM file you will surely find both. A flag of 97 (1 + 32 + 64) means:

  • paired (0x1)
  • mate on the reverse strand (0x20)
  • first in pair (0x40)

This means that the mate is also mapped and on the reverse strand. Perhaps you filtered this file before doing this operation? How did you filter it?
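
As an illustration of how the flag value breaks down, here is a minimal sketch that decodes a SAM flag into its set bits. The bit values and meanings follow the FLAG field of the SAM specification; the function name `decode_flag` and the table name `SAM_FLAGS` are made up for this example.

```python
# Bit values and their meanings for the SAM FLAG field.
SAM_FLAGS = [
    (0x1, "paired"),
    (0x2, "proper pair"),
    (0x4, "read unmapped"),
    (0x8, "mate unmapped"),
    (0x10, "read reverse strand"),
    (0x20, "mate reverse strand"),
    (0x40, "first in pair"),
    (0x80, "second in pair"),
    (0x100, "secondary alignment"),
    (0x200, "fails QC"),
    (0x400, "duplicate"),
    (0x800, "supplementary alignment"),
]


def decode_flag(flag):
    """Return the names of all bits set in a SAM flag (hypothetical helper)."""
    return [name for bit, name in SAM_FLAGS if flag & bit]


print(decode_flag(97))
# ['paired', 'mate reverse strand', 'first in pair']
```

Note that neither "read unmapped" (0x4) nor "mate unmapped" (0x8) is set in 97, which is why the aligner is expected to have written a mapped mate as well.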

written 5 months ago by Macspider

This must be an error that occurred during some of the processing of this BAM, correct?

yes

written 5 months ago by Pierre Lindenbaum
d-cameron wrote:

The following read has a flag value of 97, which indicates a paired read, but there is no other read in the BAM with the same ID. This must be an error that occurred during some of the processing of this BAM, correct?

Unfortunately, there is nothing in the SAM specs that actually requires the mate read to exist. Your file is still perfectly valid. That said, there are many possible reasons your mate could be missing, including:

  • The BAM file was filtered to include only specific regions of the genome
  • The BAM file was filtered to remove specific regions of the genome
  • Unmapped reads were removed
  • A de-duplication algorithm that was not read-pair aware was run
  • A downsampling algorithm that was not read-pair aware was run
  • The mate failed QC and was removed
  • The mate was aggressively base quality or adapter trimmed to nothing
  • Some reads were filtered on the command line (e.g. piping to grep)
  • Some other kind of filtering was performed in your pipeline
  • The mate actually is there, but you ran an algorithm that changed the alignment position so it's not in the position it's supposed to be according to your read (e.g. GATK indel realignment does this)
  • The mate wasn't there to start with
  • The SAM flag was changed and your read was never paired
  • The read name of either the read or the mate was changed
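
To see how many reads are affected, a rough sketch like the following lists paired reads for which only one end is present. It parses plain SAM text lines (for example the output of `samtools view -h`), so it needs no extra libraries; the function name `find_orphans` and the demo records are invented for illustration, and secondary/supplementary alignments are ignored as an assumption about what counts as "the mate".

```python
from collections import defaultdict


def find_orphans(sam_lines):
    """Return sorted names of paired reads missing their first or second end."""
    ends_seen = defaultdict(set)
    for line in sam_lines:
        if line.startswith("@"):
            continue  # skip header lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:
            continue  # not a complete alignment record
        name, flag = fields[0], int(fields[1])
        if not flag & 0x1:
            continue  # read is not flagged as paired
        if flag & 0x900:
            continue  # ignore secondary (0x100) and supplementary (0x800) records
        ends_seen[name].add(flag & 0xC0)  # 0x40 = first in pair, 0x80 = second
    return sorted(n for n, ends in ends_seen.items() if not {0x40, 0x80} <= ends)


# Hypothetical records: readA has lost its mate, readB is complete.
demo = [
    "@HD\tVN:1.6\tSO:coordinate",
    "readA\t97\tchr14\t100\t60\t5M\t=\t200\t105\tACGTA\tJJJJJ",
    "readB\t99\tchr14\t300\t60\t5M\t=\t400\t105\tACGTA\tJJJJJ",
    "readB\t147\tchr14\t400\t60\t5M\t=\t300\t-105\tACGTA\tJJJJJ",
]
print(find_orphans(demo))
# ['readA']
```

On a real file the lines could come from standard input instead of the `demo` list. Because this keeps every read name in memory, it is only a quick diagnostic and may be too slow or memory-hungry for very large BAMs.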

To work out why it's missing you're going to have to go back to the fastq files and work your way forward through each step in your pipeline.
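
One way to do that walk-through is to check, for each intermediate file in order, whether the missing end is present. The sketch below assumes you can dump each stage to SAM text (for example with `samtools view`); the stage labels, the function names `read_ends` and `first_stage_missing`, and the demo records are all hypothetical.

```python
def read_ends(sam_lines):
    """Collect (read name, end bit) for every primary record in SAM text lines."""
    ends = set()
    for line in sam_lines:
        if line.startswith("@"):
            continue  # skip header lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        flag = int(fields[1])
        if flag & 0x900:
            continue  # ignore secondary and supplementary alignments
        ends.add((fields[0], flag & 0xC0))  # 0x40 = first, 0x80 = second
    return ends


def first_stage_missing(stages, name, end_bit):
    """Return the label of the first stage lacking the given read end, or None.

    stages: list of (label, sam_lines) tuples in pipeline order.
    """
    for label, lines in stages:
        if (name, end_bit) not in read_ends(lines):
            return label
    return None


# Hypothetical two-stage pipeline in which a filter drops the second end.
aligned = [
    "readA\t97\tchr14\t100\t60\t5M\t=\t200\t105\tACGTA\tJJJJJ",
    "readA\t145\tchr14\t200\t60\t5M\t=\t100\t-105\tACGTA\tJJJJJ",
]
filtered = aligned[:1]
stages = [("aligned", aligned), ("filtered", filtered)]
print(first_stage_missing(stages, "readA", 0x80))
# filtered
```

The stage it reports is the step to inspect more closely. If the read name itself was altered along the way, this lookup will not find it, so that case still has to be checked by hand.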

written 3 months ago by d-cameron