Question: BWA mem how to know if a read is mapped uniquely?
1
gravatar for steven.lin.pur
22 months ago by
Taiwan
steven.lin.pur10 wrote:

I have read many posts regarding this question. The XT, XA tag doesn't seem to be used anymore by BWA mem, so I can't use them to identify uniquely mapped reads.

Some posts said that multiple hits are to be assigned a quality score of 0. However, I don't have any links to confirm if this is still true. Does anyone know where I can confirm this?

Does anyone know how to tell if a read is uniquely map from the result of BWA-mem alignment? Thank you.

bwa alignment • 4.0k views
ADD COMMENTlink modified 22 months ago by Lemire/OICR350 • written 22 months ago by steven.lin.pur10
3
gravatar for Matt Shirley
22 months ago by
Matt Shirley7.9k
Cambridge, MA
Matt Shirley7.9k wrote:

You can filter out the secondary alignments in your file by excluding reads with the "secondary" flag (0x100 binary or 256 decimal) set. You can use samtools for this:

samtools view -F 256 file.bam
ADD COMMENTlink written 22 months ago by Matt Shirley7.9k

If you want to ask for reads that have no secondary alignments, then you might need a script that runs through a read-name sorted file and looks for singleton read names.

 

ADD REPLYlink written 22 months ago by Matt Shirley7.9k

Thank you, I'll try that.

I'm just wondering why hasn't there been a tool to perform such a task. This seems to be a common task when handling bam/sam files. 

ADD REPLYlink written 22 months ago by steven.lin.pur10

N.B., this isn't a common task. You'd be well advised to drop the concept of a "unique alignment" and just filter by MAPQ (and the flags that Matt Shirley indicated).

ADD REPLYlink written 22 months ago by Devon Ryan71k
1

I am with the OP though that one should be able to select uniquely mapping reads. Even considering that I also agree that most people that want to make use of these "uniquely mapping reads" may not fully understand all the implications of it. 

After all the concept of secondary alignment is not any better defined than that of a unique read. What exactly is a secondary alignment? At what point does an alignment become secondary and under what conditions won't get showns at all? FWIW I only have a superficial understanding of how it all works.

As for the original question I believe that bwa mem will produce a mapping quality of 0 if a read aligns in multiple locations. It used to work that way for a long time, even though technically there is no requirement or standard that would require that, it is a bwa specific convention. To filter for uniquely mapped reads pass the -q 1 filtering flag to samtools view.

ADD REPLYlink modified 22 months ago • written 22 months ago by Istvan Albert ♦♦ 74k

It boils down to which definition of "uniquely mapping" you want to go by. I saw one on the samtools-help email list in the last week that would have meant filtering with samtools view -q 40.

ADD REPLYlink written 22 months ago by Devon Ryan71k
0
gravatar for Lemire/OICR
22 months ago by
Lemire/OICR350
Canada
Lemire/OICR350 wrote:

I am using version 0.7.12 and bwa mem reports the XA tag (fake data):


% bwa mem fake.fa  fake.fq   
[M::bwa_idx_load_from_disk] read 0 ALT contigs
@SQ     SN:fakechr1     LN:1500
@SQ     SN:fakechr2     LN:1500
@PG     ID:bwa  PN:bwa  VN:0.7.12-r1039 CL:bwa mem fake.fa fake.fq
[M::process] read 1 sequences (126 bp)...
[M::mem_process_seqs] Processed 1 reads in 0.000 CPU sec, 0.000 real sec
HISEQ-MFG:110:C75JDANXX:2:1101:13263:2218       0       fakechr2        251     0       126M    *       0       0       ATGGGTCATCCCCTTCACTCCCAGCTCAGAGCCCAGGCCAGGGGCCCCCAAGAAAGGCTCTGGTGGAGAACCTGTGCATGAAGGCTGTCAACCAGTCCATAGGCAAGCCTGGCTGCCTCCAGCTGG      #==BBGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGBGGGGGGGGGG      NM:i:0  MD:Z:126        AS:i:126   XS:i:126 XA:Z:fakechr1,+251,126M,0;
[main] Version: 0.7.12-r1039
[main] CMD: bwa mem fake.fa fake.fq
[main] Real time: 0.004 sec; CPU: 0.000 sec

 

ADD COMMENTlink modified 22 months ago • written 22 months ago by Lemire/OICR350

Sorry, it turns out that BWA does still give XA tag..... I was reading the latest SAM file spec and I can't find the XA tag anymore. I thought this means that the XA tag isn't available anymore.

I'm a bit confused about the output though. If there is an XA tag, doesn't mean that it is a non-unique hit and should have a XT:A:R tag as well? Also can you please tell me if you can find any 256 flag in the bam file?

ADD REPLYlink written 22 months ago by steven.lin.pur10
1

X, Y and Z tags are not specified by the SAM spec, these tags are reserved for the aligners to define them. See the bwa manual, it has description for XA etc.

http://bio-bwa.sourceforge.net/bwa.shtml

as for what tags are filled in and what are not that again is at the discretion of the aligner, there is no requirement for what should be filled and what should not

ADD REPLYlink written 22 months ago by Istvan Albert ♦♦ 74k
Please log in to add an answer.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.3.0
Traffic: 1401 users visited in the last hour