Finding total number of reads from BAM file
2
0
Entering edit mode
7.0 years ago

Hi all,

I have a bam file, and I would like to find the total number of reads (mapped + unmapped) from this file. I used the following command:

samtools idxstats myFile.bam

and received the following results:

chr1    248956422       15149505        0
chr2    242193529       5371045 0
chr3    198295559       5414053 0
chr4    190214555       2870332 0
chr5    181538259       7748857 0
chr6    170805979       6517305 0
chr7    159345973       7533417 0
chr8    145138636       3639326 0

It looks like the unmapped reads were removed. Is there any other way to find the initial number of reads from that bam file?

Thanks in advance:)

bam alignment RNA-Seq rna-seq next-gen • 13k views
ADD COMMENT
1
Entering edit mode
7.0 years ago

samtools view with option -c and the options -F and/or -f . Check the manual : http://www.htslib.org/doc/samtools.html

ADD COMMENT
0
Entering edit mode

Thank you. I'm not sure I did it correctly, but I tried the following:

samtools view -F -c myFile.bam | wc

and received:

[main_samview] truncated file.
22389498 385093622 6170795057

Does it mean that there is a problem with my file?

ADD REPLY
0
Entering edit mode

no -F takes an argument (a bit set of sam flags ) . Please, read the manual.

-f int Only output alignments with all bits set in INT present in the FLAG field. INT can be specified in hex by beginning with 0x' (i.e. /^0x[0-9A-F]+/) or in octal by beginning with0' (i.e. /^0[0-7]+/) [0].

-F INT Do not output alignments with any bits set in INT present in the FLAG field. INT can be specified in hex by beginning with 0x' (i.e. /^0x[0-9A-F]+/) or in octal by beginning with0' (i.e. /^0[0-7]+/) [0].

ADD REPLY
1
Entering edit mode
7.0 years ago

Maybe you can find useful the samtools flagstat command. Here goes an example of the information it provides

samtools flagstat your_sam_or_bam.file 

78413342 + 0 in total (QC-passed reads + QC-failed reads)
18152 + 0 secondary
0 + 0 supplementary
0 + 0 duplicates
600227 + 0 mapped (0.77%:-nan%)
78395190 + 0 paired in sequencing
39197595 + 0 read1
39197595 + 0 read2
471162 + 0 properly paired (0.60%:-nan%)
471162 + 0 with itself and mate mapped
117726 + 0 singletons (0.15%:-nan%)
0 + 0 with mate mapped to a different chr
0 + 0 with mate mapped to a different chr (mapQ>=5)
ADD COMMENT
0
Entering edit mode

Thanks. It looks like there is no data regarding the unmapped reads in this BAM file.

This is the output of samtools flagstat command:

22389498 + 0 in total (QC-passed reads + QC-failed reads)
0 + 0 duplicates
22389498 + 0 mapped (100.00%:-nan%)
22389498 + 0 paired in sequencing
11258030 + 0 read1
11131468 + 0 read2
21610733 + 0 properly paired (96.52%:-nan%)
21610733 + 0 with itself and mate mapped
778765 + 0 singletons (3.48%:-nan%)
0 + 0 with mate mapped to a different chr
0 + 0 with mate mapped to a different chr (mapQ>=5)

Am I right?

ADD REPLY
0
Entering edit mode

Yes, you have the 100% of mapped reads.

ADD REPLY
0
Entering edit mode

So there is no way to receive the initial number of reads from this file, correct?

ADD REPLY
0
Entering edit mode

the initial number of reads is 22389498. in any case you can find the initial number of reads using grep command or, for example, qualimap tool.

ADD REPLY

Login before adding your answer.

Traffic: 2071 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6