Question: samtools flagstat read count does not match the actual total number of reads
Vijay Lakhujani (India) wrote, 2.4 years ago:

samtools flagstat results:

some number + 0 in total (QC-passed reads + QC-failed reads)

This "some number" does not match the actual number of reads in my paired FASTQ files. Is this expected?

Devon Ryan (Freiburg, Germany) wrote, 2.4 years ago:

The number reported is the number of records actually in the file. If your aligner produced secondary alignments, this will often be higher than the original number of reads in the FASTQ files.
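To see where the extra records come from, it can help to tally them by type. The following is only a minimal sketch, not how flagstat itself is implemented: it reads SAM text lines (for example, from `samtools view -h`) and uses the FLAG bits defined in the SAM specification. The function name `count_records` and the toy records are made up for illustration.

```python
# SAM FLAG bits, as defined in the SAM specification.
SECONDARY = 0x100
SUPPLEMENTARY = 0x800


def count_records(sam_lines):
    """Tally alignment records by type from an iterable of SAM text lines.

    Hypothetical helper: a record flagged as both secondary and
    supplementary is counted as secondary only.
    """
    counts = {"total": 0, "primary": 0, "secondary": 0, "supplementary": 0}
    for line in sam_lines:
        # Skip blank lines and header lines (headers start with "@").
        if not line.strip() or line.startswith("@"):
            continue
        flag = int(line.split("\t")[1])  # FLAG is the second column
        counts["total"] += 1
        if flag & SECONDARY:
            counts["secondary"] += 1
        elif flag & SUPPLEMENTARY:
            counts["supplementary"] += 1
        else:
            counts["primary"] += 1
    return counts


# Toy single-end input (invented): r1 has a primary and a secondary
# alignment, r2 is unmapped but still present in the file.
example = [
    "@HD\tVN:1.6",
    "r1\t0\tchr1\t100\t60\t4M\t*\t0\t0\tACGT\tIIII",
    "r1\t256\tchr2\t500\t0\t4M\t*\t0\t0\tACGT\tIIII",
    "r2\t4\t*\t0\t0\t*\t*\t0\t0\tACGT\tIIII",
]
print(count_records(example))
```

In this toy case the file holds three records for two reads. The "primary" tally is the one that should match the FASTQ read count, provided the aligner kept every read, including the unmapped ones.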


igor replied, 2.4 years ago:

Also, the aligner may throw out unaligned reads, in which case the count can be lower than the number of reads in the FASTQ files.
dariober (Glasgow, UK) wrote, 2.4 years ago:

My 2p: It's good to keep in mind that SAM/BAM stores alignments, not reads (in fact, SAM stands for Sequence Alignment/Map), even though unmapped reads can be present.

For this reason the simple question "how many reads have been aligned?" can be tricky to answer. A simple strategy is to count the reads that have not been aligned and subtract that from the raw read count in the FASTQ files. But if you want to know how many reads have been aligned under certain criteria (e.g. MAPQ > x, alignment score > y, properly paired, etc.), then you should also take split reads into account.
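Both strategies can be sketched in a few lines. This is an illustration under stated assumptions, not a definitive recipe: it works on SAM text lines, treats a read as the pair (name, mate number), ignores secondary alignments, and lets split (supplementary) records of the same read collapse into one entry. The function names, the MAPQ threshold, and the toy records are all hypothetical.

```python
# SAM FLAG bits, as defined in the SAM specification.
UNMAPPED = 0x4
READ1 = 0x40
READ2 = 0x80
SECONDARY = 0x100
SUPPLEMENTARY = 0x800


def records(sam_lines):
    """Yield (name, flag, mapq) for each non-header SAM line."""
    for line in sam_lines:
        if not line.strip() or line.startswith("@"):
            continue
        fields = line.split("\t")
        yield fields[0], int(fields[1]), int(fields[4])


def count_unmapped(sam_lines):
    """Count unmapped reads; subtract this from the raw FASTQ read count."""
    return sum(
        1
        for _, flag, _ in records(sam_lines)
        if flag & UNMAPPED and not flag & (SECONDARY | SUPPLEMENTARY)
    )


def reads_aligned(sam_lines, min_mapq=0):
    """Return the distinct reads with a mapped, non-secondary record
    whose MAPQ is at least min_mapq."""
    aligned = set()
    for name, flag, mapq in records(sam_lines):
        if flag & (UNMAPPED | SECONDARY) or mapq < min_mapq:
            continue
        # Key on name plus mate so both ends of a pair count separately,
        # while split records of one read are counted once.
        mate = 1 if flag & READ1 else 2 if flag & READ2 else 0
        aligned.add((name, mate))
    return aligned


# Toy single-end input (invented): r1 is a split read with two records,
# r2 is unmapped, r3 maps with a low MAPQ.
example = [
    "r1\t0\tchr1\t100\t60\t2M2S\t*\t0\t0\tACGT\tIIII",
    "r1\t2048\tchr1\t900\t60\t2S2M\t*\t0\t0\tACGT\tIIII",
    "r2\t4\t*\t0\t0\t*\t*\t0\t0\tACGT\tIIII",
    "r3\t0\tchr2\t50\t5\t4M\t*\t0\t0\tACGT\tIIII",
]
raw_read_count = 3  # assumed to come from counting the FASTQ records
print(raw_read_count - count_unmapped(example))
print(len(reads_aligned(example, min_mapq=20)))
```

Here the file contains three mapped records, but only two reads are aligned, and only one of them passes the MAPQ filter. Counting records instead of reads would overstate both figures.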

I suspect the use of the word "read" in samtools flagstat causes a lot of misunderstandings in this respect.
