Question: Strange output of samtools flagstat?
16 months ago by
heking0 wrote:

Hi, I aligned my single-end FASTQ files using HISAT2:

    hisat2 --dta -x /mnt/lustre/users/k1632479/grcm38/genome \
        -U /mnt/lustre/users/k1632479/ESC_NSC/13799X1_161209_D00294_0278_BCAEAJANXX_1.fastq.gz \
        -S /mnt/lustre\

I thought the alignment had succeeded, given this output:

    52586374 reads; of these:
      52586374 (100.00%) were unpaired; of these:
        4123360 (7.84%) aligned 0 times
        33940663 (64.54%) aligned exactly 1 time
        14522351 (27.62%) aligned >1 times
    92.16% overall alignment rate

However, when I run

    samtools flagstat 13799X1_test.sam
    60276818 + 0 in total (QC-passed reads + QC-failed reads)
    7690444 + 0 secondary
    0 + 0 supplementary
    0 + 0 duplicates
    56153458 + 0 mapped (93.16% : N/A)
    0 + 0 paired in sequencing
    0 + 0 read1
    0 + 0 read2
    0 + 0 properly paired (N/A : N/A)
    0 + 0 with itself and mate mapped
    0 + 0 singletons (N/A : N/A)
    0 + 0 with mate mapped to a different chr

this is the output. Does anyone know why?

hisat2 rna-seq samtools mouse • 444 views

Why do you think it has not aligned successfully, based on the samtools flagstat output? Compare the two outputs.

written 16 months ago by Prakash1.9k

I agree with Prakash. I see no significant discrepancy between the two outputs. The alignment is fine.

written 16 months ago by swbarnes27.7k
16 months ago by
Istvan Albert ♦♦ 84k (University Park, USA) wrote:

One lesson that I have learned (the hard way) is that it is challenging (sometimes impossible) to precisely reproduce the statistics generated by different tools.

Words such as "mapped", "singletons", and "total" are not well defined. For example, in this case, what is "total": the number of reads, the number of pairs, or the number of alignments? Did any other filtering take place? Here the first tool reports read counts, while the second tool reports alignment counts, so a read with secondary alignments is counted more than once. The number of primary alignments is then:

60276818 - 7690444 = 52586374

which matches the number of reads reported by the first tool. This time it was relatively easy to figure out. Once you have more complex alignments and more flags are reported, it becomes a lot harder.
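The same reconciliation can be written out as a small check. This is only a sketch using the numbers pasted in the question; the variable names are made up for illustration, and it assumes that flagstat's "in total" counts SAM records (including the unaligned reads HISAT2 appears to have written to the file), which the numbers suggest but the thread does not state outright.

```python
# Numbers copied from the two outputs in the question.
# Variable names are illustrative, not part of either tool.
hisat2_reads = 52_586_374        # HISAT2: "reads; of these:"
hisat2_unaligned = 4_123_360     # HISAT2: "aligned 0 times"

flagstat_total = 60_276_818      # flagstat: "in total" (assumed: SAM records, not reads)
flagstat_secondary = 7_690_444   # flagstat: "secondary"
flagstat_supplementary = 0       # flagstat: "supplementary"
flagstat_mapped = 56_153_458     # flagstat: "mapped"

# Removing secondary and supplementary records should leave one record per read.
primary_records = flagstat_total - flagstat_secondary - flagstat_supplementary

# Records that flagstat does not count as mapped.
unmapped_records = flagstat_total - flagstat_mapped

# The two percentages use different denominators: records vs. reads.
flagstat_rate = round(100 * flagstat_mapped / flagstat_total, 2)
hisat2_rate = round(100 * (hisat2_reads - hisat2_unaligned) / hisat2_reads, 2)

print("primary records match read count:", primary_records == hisat2_reads)
print("unmapped records match 'aligned 0 times':", unmapped_records == hisat2_unaligned)
print("flagstat rate:", flagstat_rate, "HISAT2 rate:", hisat2_rate)
```

The one-point gap between 93.16% and 92.16% then comes only from the denominator: the secondary alignments are all mapped records, so counting them raises the mapped fraction.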
