Question: Strange output of samtools flagstat?
heking wrote:

Hi, I aligned my single-end FASTQ files using HISAT2:

    hisat2 --dta -x /mnt/lustre/users/k1632479/grcm38/genome \
        -U /mnt/lustre/users/k1632479/ESC_NSC/13799X1_161209_D00294_0278_BCAEAJANXX_1.fastq.gz \
        -S /mnt/lustre/users/k1632479/ESC_NSC/13799X1_test.sam

I thought the alignment had succeeded because of this output:

    52586374 reads; of these:
      52586374 (100.00%) were unpaired; of these:
        4123360 (7.84%) aligned 0 times
        33940663 (64.54%) aligned exactly 1 time
        14522351 (27.62%) aligned >1 times
    92.16% overall alignment rate

However, when I run samtools flagstat on the resulting SAM file, I get:

    samtools flagstat 13799X1_test.sam
    60276818 + 0 in total (QC-passed reads + QC-failed reads)
    7690444 + 0 secondary
    0 + 0 supplementary
    0 + 0 duplicates
    56153458 + 0 mapped (93.16% : N/A)
    0 + 0 paired in sequencing
    0 + 0 read1
    0 + 0 read2
    0 + 0 properly paired (N/A : N/A)
    0 + 0 with itself and mate mapped
    0 + 0 singletons (N/A : N/A)
    0 + 0 with mate mapped to a different chr

This is the output. Does anyone know why?


Why do you think the alignment was unsuccessful, based on the samtools flagstat output? Compare the two outputs.

written 4 months ago by Prakash

I agree with Prakash. I see no significant discrepancy between the two outputs. The alignment is fine.

written 4 months ago by swbarnes2
Istvan Albert wrote:

One lesson that I have learned (the hard way) is that it is challenging (sometimes impossible) to precisely reproduce the statistics generated by different tools.

Words such as "mapped", "singletons", and "total" are not well defined. For example, in this case, what is "total": the number of reads, the number of pairs, or the number of alignments? Did any other filtering take place? Here the first tool reports read numbers, while the second tool reports alignments. The number of primary alignments is then:

60276818 - 7690444 = 52586374

which matches the number reported by the first tool. This time it was relatively easy to figure out. Once you have more complex alignments and more flags are reported, it becomes a lot harder.
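
As a quick sanity check, the same bookkeeping can be sketched in a few lines of shell. This is only an illustration: the variable names are made up for this example, and the numbers are copied by hand from the two reports quoted in the question rather than parsed from the files.

```shell
#!/usr/bin/env bash
# Numbers copied from the HISAT2 summary and the flagstat report above.
# All variable names are hypothetical, chosen for this sketch only.
hisat2_reads=52586374        # HISAT2: total reads
hisat2_unaligned=4123360     # HISAT2: "aligned 0 times"
flagstat_total=60276818      # flagstat: "in total"
flagstat_secondary=7690444   # flagstat: "secondary"
flagstat_mapped=56153458     # flagstat: "mapped"

# Records that are not secondary alignments: one per input read.
primary=$((flagstat_total - flagstat_secondary))

# Records that are not counted as mapped.
unmapped=$((flagstat_total - flagstat_mapped))

echo "non-secondary records: $primary (HISAT2 reads: $hisat2_reads)"
echo "unmapped records:      $unmapped (HISAT2 aligned 0 times: $hisat2_unaligned)"
```

Both pairs of numbers agree. The two percentages differ only because of the denominator: HISAT2 divides aligned reads by the 52586374 reads (92.16%), whereas flagstat divides mapped records by all 60276818 records, secondary ones included (93.16%). If you would rather count directly from the file, something like `samtools view -c -F 256 13799X1_test.sam` should report the number of non-secondary records, assuming a samtools version that auto-detects SAM input.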
