Question: BWA mem how to know if a read is mapped uniquely?
gravatar for steven.lin.pur
4.5 years ago by
steven.lin.pur10 wrote:

I have read many posts regarding this question. The XT, XA tag doesn't seem to be used anymore by BWA mem, so I can't use them to identify uniquely mapped reads.

Some posts said that multiple hits are to be assigned a quality score of 0. However, I don't have any links to confirm if this is still true. Does anyone know where I can confirm this?

Does anyone know how to tell if a read is uniquely map from the result of BWA-mem alignment? Thank you.

bwa alignment • 9.3k views
ADD COMMENTlink modified 4.5 years ago by Lemire570 • written 4.5 years ago by steven.lin.pur10
gravatar for Matt Shirley
4.5 years ago by
Matt Shirley9.3k
Cambridge, MA
Matt Shirley9.3k wrote:

You can filter out the secondary alignments in your file by excluding reads with the "secondary" flag (0x100 binary or 256 decimal) set. You can use samtools for this:

samtools view -F 256 file.bam
ADD COMMENTlink written 4.5 years ago by Matt Shirley9.3k

If you want to ask for reads that have no secondary alignments, then you might need a script that runs through a read-name sorted file and looks for singleton read names.

ADD REPLYlink modified 5 months ago by RamRS27k • written 4.5 years ago by Matt Shirley9.3k

Thank you, I'll try that.

I'm just wondering why hasn't there been a tool to perform such a task. This seems to be a common task when handling bam/sam files.

ADD REPLYlink modified 5 months ago by RamRS27k • written 4.5 years ago by steven.lin.pur10

N.B., this isn't a common task. You'd be well advised to drop the concept of a "unique alignment" and just filter by MAPQ (and the flags that Matt Shirley indicated).

ADD REPLYlink written 4.5 years ago by Devon Ryan95k

I am with the OP though that one should be able to select uniquely mapping reads. Even considering that I also agree that most people that want to make use of these "uniquely mapping reads" may not fully understand all the implications of it.

After all the concept of secondary alignment is not any better defined than that of a unique read. What exactly is a secondary alignment? At what point does an alignment become secondary and under what conditions won't get showns at all? FWIW I only have a superficial understanding of how it all works.

As for the original question I believe that bwa mem will produce a mapping quality of 0 if a read aligns in multiple locations. It used to work that way for a long time, even though technically there is no requirement or standard that would require that, it is a bwa specific convention. To filter for uniquely mapped reads pass the -q 1 filtering flag to samtools view.

ADD REPLYlink modified 5 months ago by RamRS27k • written 4.5 years ago by Istvan Albert ♦♦ 84k

It boils down to which definition of "uniquely mapping" you want to go by. I saw one on the samtools-help email list in the last week that would have meant filtering with samtools view -q 40.

ADD REPLYlink written 4.5 years ago by Devon Ryan95k

This won't work with output from bwa because it (always?) sticks secondary alignments into the XA tag of the primary alignment instead of recording them as secondary alignments in the samfile.

ADD REPLYlink written 11 months ago by markamery0
gravatar for Lemire
4.5 years ago by
Lemire570 wrote:

I am using version 0.7.12 and bwa mem reports the XA tag (fake data):

% bwa mem fake.fa  fake.fq   
[M::bwa_idx_load_from_disk] read 0 ALT contigs
@SQ     SN:fakechr1     LN:1500
@SQ     SN:fakechr2     LN:1500
@PG     ID:bwa  PN:bwa  VN:0.7.12-r1039 CL:bwa mem fake.fa fake.fq
[M::process] read 1 sequences (126 bp)...
[M::mem_process_seqs] Processed 1 reads in 0.000 CPU sec, 0.000 real sec
[main] Version: 0.7.12-r1039
[main] CMD: bwa mem fake.fa fake.fq
[main] Real time: 0.004 sec; CPU: 0.000 sec
ADD COMMENTlink modified 5 months ago by RamRS27k • written 4.5 years ago by Lemire570

Sorry, it turns out that BWA does still give XA tag..... I was reading the latest SAM file spec and I can't find the XA tag anymore. I thought this means that the XA tag isn't available anymore.

I'm a bit confused about the output though. If there is an XA tag, doesn't mean that it is a non-unique hit and should have a XT:A:R tag as well? Also can you please tell me if you can find any 256 flag in the bam file?

ADD REPLYlink written 4.5 years ago by steven.lin.pur10

X, Y and Z tags are not specified by the SAM spec, these tags are reserved for the aligners to define them. See the bwa manual, it has description for XA etc.

as for what tags are filled in and what are not that again is at the discretion of the aligner, there is no requirement for what should be filled and what should not

ADD REPLYlink written 4.5 years ago by Istvan Albert ♦♦ 84k
Please log in to add an answer.


Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.3.0
Traffic: 2354 users visited in the last hour