Counting and extracting in a new file the unique mapped reads from BWA 0.7.15-r1140 mapping
1
0
Entering edit mode
6.5 years ago
valopes ▴ 30

Hello guys.

I am trying to have a bam file containing just the unique mapped reads from BWA.

I've used to check the mapping results:

samtools flagstat input.bam

and I got

151399675 + 0 in total (QC-passed reads + QC-failed reads)
1299223 + 0 secondary
0 + 0 supplementary
0 + 0 duplicates
150524060 + 0 mapped (99.42% : N/A)
150100452 + 0 paired in sequencing
75050226 + 0 read1
75050226 + 0 read2
141573428 + 0 properly paired (94.32% : N/A)
148964374 + 0 with itself and mate mapped
260463 + 0 singletons (0.17% : N/A)
6491734 + 0 with mate mapped to a different chr
5052530 + 0 with mate mapped to a different chr (mapQ>=5)

I know 150524060 means the unique and multiple mapped reads, but I want to make sure how many unique mapped I have.

So I used:

samtools view -F 0x40 -c input.bam

and I got 75664312

I am not sure if this number above is right.

Could someone please help me with this? Which command should I use to count unique mapped reads? and which to extract in a new bam file just the unique mapped (excluding the multi and unmapped ones)?

Thanks in advance.

Assembly • 4.1k views
ADD COMMENT
0
Entering edit mode

I added markup to your post for increased readability. You can do this by selecting the text and clicking the 101010 button. When you compose or edit a post that button is in your toolbar, see image below:

101010 Button

ADD REPLY
0
Entering edit mode

Thanks. I didn't know that.

ADD REPLY
0
Entering edit mode

Hi, all I know for now is that 0x40 is not the correct flag. Take a look at the SAM specification here: https://samtools.github.io/hts-specs/SAMv1.pdf (go to page 5). I don't believe that BWA records anything specific about unique mappings, but it does retain information about primary (best) and secondary alignments. However, I know that it's possible to infer uniquely mapped reads if you align with Bowtie (bowtie 1), using --best and -m 1 command-line parameters. Bowtie2 doesn't have these parameters but it still retains unique mapping information. When you run Bowtie/Bowtie2, the alignment report gives this information. I've just done some mappings right now (see the part in bold):

5157135 (100.00%) were paired; of these:

805804 (15.63%) aligned concordantly 0 times

2102530 (40.77%) aligned concordantly exactly 1 time

2248801 (43.61%) aligned concordantly >1 times

...

86.78% overall alignment rate

ADD REPLY
0
Entering edit mode

Thank you. As I was used to work with Bowtie2, I was trying to get the same info with BWA. But now I see it is different.

ADD REPLY
1
Entering edit mode
6.5 years ago
ATpoint 81k

Here is a tool that can help you handle the bitwise flags. 0x40 stands for 'first in pair', and samtools view -F stands for discarding those reads flagged as such. Your settings do not really make sense here. If unique means that multimappers are to be excluded and only high-quality alignments are to be counted, I would go with:

samtools view -bh -q 30 -f 3 -F 2316 -c in.bam

It keeps (-f) mapped and properly paired reads that are not multimappers (MAPQ=0), discarding (-F) non-primary alignment, unmapped reads and mates. Play around with the flags to fit your needs.

ADD COMMENT
0
Entering edit mode

Thank you. I will try it and let you know how it work for me.

ADD REPLY

Login before adding your answer.

Traffic: 1532 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6