How to interpret flagstat output
6 weeks ago

Hi, I have an issue similar to some that have already been posted.

FastQC reports 2,941,170 total sequences, whereas flagstat outputs the following:

3510466 + 0 in total (QC-passed reads + QC-failed reads)
3508788 + 0 primary
0 + 0 secondary
1678 + 0 supplementary
0 + 0 duplicates
0 + 0 primary duplicates
3510466 + 0 mapped (100.00% : N/A)
3508788 + 0 primary mapped (100.00% : N/A)
3508788 + 0 paired in sequencing
3501708 + 0 properly paired (99.80% : N/A)
3505324 + 0 with itself and mate mapped
3464 + 0 singletons (0.10% : N/A)
0 + 0 with mate mapped to a different chr
0 + 0 with mate mapped to a different chr (mapQ>=5)


I cannot understand the difference, which is not explained by supplementary or unmapped reads. Thanks a lot.

6 weeks ago

It sums up like this:

3508788 + 1678 = 3510466


There are no unmapped reads. Some of your reads generate more than one alignment; these are the supplementary (chimeric) alignments, and 1678 of them are present.
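As a rough sketch of that arithmetic, the few lines below parse the relevant flagstat lines and check that primary, secondary and supplementary records add up to the total. The `parse_flagstat` helper is hypothetical (it is not part of samtools), and the text is simply copied from the question:

```python
import re

# flagstat lines copied from the question.
FLAGSTAT = """\
3510466 + 0 in total (QC-passed reads + QC-failed reads)
3508788 + 0 primary
0 + 0 secondary
1678 + 0 supplementary
3510466 + 0 mapped (100.00% : N/A)
"""

def parse_flagstat(text):
    """Hypothetical helper: map each flagstat label to its QC-passed count."""
    counts = {}
    for line in text.splitlines():
        m = re.match(r"(\d+) \+ (\d+) (.+)", line)
        if m:
            # Drop the trailing parenthetical, e.g. "(100.00% : N/A)".
            label = re.sub(r"\s*\(.*\)$", "", m.group(3))
            counts[label] = int(m.group(1))
    return counts

counts = parse_flagstat(FLAGSTAT)

# Every record is primary, secondary or supplementary.
records = counts["primary"] + counts["secondary"] + counts["supplementary"]
# Records that are in the file but not mapped.
unmapped = counts["in total"] - counts["mapped"]

print(records == counts["in total"], unmapped)
```

Note that this only reconciles flagstat with itself; it says nothing about the FastQC count.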

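The data seems to have been filtered, though: some pairs are broken. flagstat reports 3464 singletons (mapped reads whose mate is unmapped), yet no unmapped records are present, so the unmapped mates must have been removed from the file.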
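The broken pairs can be checked from the same numbers. This is only a sketch: the counts are typed in from the flagstat output in the question, and it assumes that each singleton's mate would normally appear in the file as its own unmapped record.

```python
# Counts taken from the flagstat output in the question.
paired = 3508788          # "paired in sequencing"
both_mapped = 3505324     # "with itself and mate mapped"
singletons = 3464         # mapped reads whose mate is flagged unmapped
primary = 3508788         # "primary"
primary_mapped = 3508788  # "primary mapped"

# Paired reads split into "both mates mapped" and "singletons".
pairs_accounted_for = both_mapped + singletons == paired

# Unmapped primary records actually present in the file.
unmapped_present = primary - primary_mapped

# Assumption: one absent unmapped mate per singleton.
missing_mates = singletons - unmapped_present

print(pairs_accounted_for, unmapped_present, missing_mates)
```

If the file had not been filtered, `unmapped_present` would be expected to be at least as large as the singleton count.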


Thanks a lot Istvan! Even accounting for these supplementary alignments, I still cannot explain why the number is higher than the FastQC result.


I totally missed the FastQC number, which seems to be the main point of the question. I focused on flagstat alone.

I don't know what to say about that. The numbers should match; perhaps you are running it on a different version of the file.