Question: Samtools flagstat results
3.3 years ago by
banerjeeshayantan190 wrote:
```
221372 + 0 in total (QC-passed reads + QC-failed reads)
0 + 0 secondary
0 + 0 supplementary
20419 + 0 duplicates
218469 + 0 mapped (98.69% : N/A)
155851 + 0 paired in sequencing
142663 + 0 properly paired (91.54% : N/A)
150045 + 0 with itself and mate mapped
2903 + 0 singletons (1.86% : N/A)
4938 + 0 with mate mapped to a different chr
2120 + 0 with mate mapped to a different chr (mapQ>=5)
```

I have the following questions:
1. From previous posts I understood that read1 may not be equal to read2, as there may be reads whose mates didn't align. These are singletons. So doesn't that mean that read2 - read1 must be equal to the number of singletons? What am I missing here?
2. What does the field "with itself and mate mapped" mean?


What preprocessing steps have been applied to get this file?

I ran this command: `samtools flagstat example.bam`. Does this help?

3.3 years ago by
Devon Ryan97k (Freiburg, Germany) wrote:
1. Singletons occur when only one mate in a pair aligns. You can also have situations where one mate aligns multiple times (e.g., to a simple repeat) and the other only once. Then one will have a single entry and the other may have multiple. Also, if you did any filtering then that'd affect this as well.
2. It means exactly what it says: both mates mapped. They may be "properly paired" or they may not be. Regardless, if they both align somewhere at least once then they count toward this.
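
To make the two categories concrete, here is a minimal sketch of how a single alignment record could be sorted into them from its SAM FLAG value. The bit values come from the SAM specification; the `classify` function and its labels are illustrative names, not part of samtools, and the sketch ignores secondary and supplementary records.

```python
# SAM FLAG bits (from the SAM specification).
PAIRED = 0x1          # read is paired in sequencing
PROPER_PAIR = 0x2     # aligner considers the pair "properly paired"
UNMAPPED = 0x4        # this read is unmapped
MATE_UNMAPPED = 0x8   # this read's mate is unmapped


def classify(flag: int) -> str:
    """Sort one record into a flagstat-like category (hypothetical helper)."""
    if not flag & PAIRED:
        return "unpaired"
    if flag & UNMAPPED:
        return "unmapped"
    if flag & MATE_UNMAPPED:
        # The read itself mapped, but its mate did not.
        return "singleton"
    # Both mates mapped, whether or not PROPER_PAIR is set.
    return "with itself and mate mapped"


for flag in (99, 65, 73, 133):
    proper = bool(flag & PROPER_PAIR)
    print(flag, classify(flag), "proper pair:", proper)
```

Flags 99 and 65 both land in "with itself and mate mapped", but only 99 has the proper-pair bit set, which is why that count can be larger than "properly paired".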

When I subtract the number of mapped reads from the total number of reads (221372 - 218469 = 2903), I get the number of singletons. So can I say that singletons are the reads which didn't map to any reference?

By definition a singleton cannot be unmapped. If it is, it's not a singleton.
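
The two numbers can still coincide: each singleton is a mapped read whose mate is unmapped, so 2903 singletons imply up to 2903 unmapped mates. The arithmetic below is a rough check using only the counts from the flagstat output in the question. The variable names are mine, and reading every unmapped read as the mate of a singleton is an assumption that happens to fit these counts, not something flagstat reports directly.

```python
# Counts copied from the flagstat output in the question.
total = 221372
mapped = 218469
paired = 155851
both_mapped = 150045   # "with itself and mate mapped"
singletons = 2903

unmapped = total - mapped                         # reads that did not map
paired_and_mapped = both_mapped + singletons      # paired reads that mapped
paired_and_unmapped = paired - paired_and_mapped  # paired reads that did not map
unpaired = total - paired                         # reads not paired in sequencing

print("unmapped reads:        ", unmapped)
print("paired, unmapped reads:", paired_and_unmapped)
print("singletons:            ", singletons)
print("unpaired reads:        ", unpaired)
```

Under that reading, the unmapped reads are the mates of the singletons, while the singletons themselves are counted among the mapped reads.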

Thanks for your reply. Can you explain what you meant by "You can also have situations where one mate aligns multiple times (e.g., to a simple repeat) and the other only once"? I know it is a trivial question, but I am entirely new to this field, hence asking.

Suppose you have the sequence `ATATATATATATATATATAAGCGCTAGCTAGTCGATCTAGCTAGCTGATCGGTCGTCAGAC`. You might have reads `ATATATAT` and `GCGCTAGC`. The latter read can only align to one place in that sequence. The former read can align equally well to multiple places. Consequently, some aligners will produce multiple entries for `ATATATAT` and a single one for `GCGCTAGC`.
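
The example above can be checked with a short sketch that lists every exact match position of each read in the sequence. This is plain forward-strand string matching, which is a simplification: real aligners also tolerate mismatches and search the reverse strand. The function name `exact_match_positions` is made up for illustration.

```python
def exact_match_positions(reference: str, read: str) -> list[int]:
    """Return the 0-based start of every exact (possibly overlapping) match."""
    positions = []
    start = reference.find(read)
    while start != -1:
        positions.append(start)
        # Advance by one base so overlapping matches are found too.
        start = reference.find(read, start + 1)
    return positions


reference = "ATATATATATATATATATAAGCGCTAGCTAGTCGATCTAGCTAGCTGATCGGTCGTCAGAC"

for read in ("ATATATAT", "GCGCTAGC"):
    hits = exact_match_positions(reference, read)
    print(read, "matches at", len(hits), "position(s):", hits)
```

The repeat-derived read reports several start positions and the other read reports one, which is the situation in which an aligner may emit multiple entries for one mate and a single entry for the other.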