Question: Thorough Documentation Of Samtools Flagstat
0
gravatar for Biomonika (Noolean)
5.5 years ago by
State College, PA, USA
Biomonika (Noolean)3.0k wrote:

I am trying to find thoroughfull documentation of samtools flagstat for every row for file Un.bam

samtools flagstat Un.bam
8444 + 0 in total (QC-passed reads + QC-failed reads)
0 + 0 duplicates
8154 + 0 mapped (96.57%:nan%)
8444 + 0 paired in sequencing
4383 + 0 read1
4061 + 0 read2
5973 + 0 properly paired (70.74%:nan%)
7793 + 0 with itself and mate mapped
361 + 0 singletons (4.28%:nan%)
1428 + 0 with mate mapped to a different chr
337 + 0 with mate mapped to a different chr (mapQ>=5)

samtools view -F 4 -q 1 -f 2 -b Un.bam >filtered.bam
samtools flagstat filtered.bam 
4613 + 0 in total (QC-passed reads + QC-failed reads)
0 + 0 duplicates
4613 + 0 mapped (100.00%:nan%)
4613 + 0 paired in sequencing
2330 + 0 read1
2283 + 0 read2
4613 + 0 properly paired (100.00%:nan%)
4613 + 0 with itself and mate mapped
0 + 0 singletons (0.00%:nan%)
61 + 0 with mate mapped to a different chr
53 + 0 with mate mapped to a different chr (mapQ>=5)

EDIT: without demanding mapQ>1

samtools view -F 4 -f 2 -b Un.bam >filtered.bam
[mqm5775@hammer22 bam]$ samtools flagstat filtered.bam 
5973 + 0 in total (QC-passed reads + QC-failed reads)
0 + 0 duplicates
5973 + 0 mapped (100.00%:nan%)
5973 + 0 paired in sequencing
3010 + 0 read1
2963 + 0 read2
5973 + 0 properly paired (100.00%:nan%)
5973 + 0 with itself and mate mapped
0 + 0 singletons (0.00%:nan%)
207 + 0 with mate mapped to a different chr
53 + 0 with mate mapped to a different chr (mapQ>=5)

Since I asked for reads mapped in proper pair I would expect read1 and read2 numbers to be the same, but maybe I am just misunderstanding what these rows mean. Could someone please point me to the documentation or explain this issue? Thanks a lot.

samtools • 3.9k views
ADD COMMENTlink modified 5.5 years ago by geek_y9.4k • written 5.5 years ago by Biomonika (Noolean)3.0k
2
gravatar for Istvan Albert
5.5 years ago by
Istvan Albert ♦♦ 80k
University Park, USA
Istvan Albert ♦♦ 80k wrote:

As far as I know the flagstat command is a convenience function that produces the same values you would get by selecting the right filters on a samtools view command.

For example properly paired means:

 samtools view -c -f 3 data.bam

then itself and mate mapped is:

 samtools view -c -F 12 data.bam

etc

Is it possible that this alignment contains both paired and unpaired reads? It seems that even your original file has more of read1 than read2?

samtools view -c -f 64 data.bam
samtools view -c -f 128 data.bam

should produce the same numbers that match the flagstat output.

ADD COMMENTlink modified 5.5 years ago • written 5.5 years ago by Istvan Albert ♦♦ 80k
1
gravatar for BruceB
5.5 years ago by
BruceB330
Cambridge, UK
BruceB330 wrote:

The -q 1 part excludes reads with a mapping quality of less than 1 (default value is 0). This may be excluding some reads, so try it again without that part.

I have tested this on several of my BAM files and they all come out as having equal numbers of read 1 and read 2, as you would expect.

ADD COMMENTlink written 5.5 years ago by BruceB330

I rerun flagstat without demand for mapq >=1 and values still differ. Should I assume that something is already wrong with my Un.bam file? If yes, can you guess what it could be? :)

ADD REPLYlink modified 5.5 years ago • written 5.5 years ago by Biomonika (Noolean)3.0k

Could you give us more details about how the BAM file has been generated.

ADD REPLYlink written 5.5 years ago by BruceB330

I was bwa mem with paired-end reads with -M option (Mark shorter split hits as secondary (for Picard compatibility).)

ADD REPLYlink written 5.5 years ago by Biomonika (Noolean)3.0k
1
gravatar for geek_y
5.5 years ago by
geek_y9.4k
Barcelona/CRG/London/Imperial
geek_y9.4k wrote:

If you got this bam file from bowtie2, and if you choose the option --no-mixed while aligning the data, you would end up only with reads that have corresponding mates. Hence you can expect exact pair of reads. Its better if you give some details about your alignment process.

ADD COMMENTlink written 5.5 years ago by geek_y9.4k

You are right, I used bwa mem for my alignment. I now think that these unequal numbers for read1 and read2 may be result of split alignment. I will rerun it again to confirm this. I used -M option (Mark shorter split hits as secondary (for Picard compatibility)).

ADD REPLYlink modified 5.5 years ago • written 5.5 years ago by Biomonika (Noolean)3.0k
Please log in to add an answer.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.3.0
Traffic: 762 users visited in the last hour