BWA: "XT:A:U" and MAPQ of 0 at the same time
10.3 years ago
Steffi ▴ 570

I map RNA-Seq data to the genome with BWA. The SAM output files from BWA contain reads that have the tag "XT:A:U" and, at the same time, a mapping quality of 0.

What does this mean? I thought that "XT:A:U" means uniquely best hit?! How does this then go together with a MAPQ of 0?

Best, Stefanie


I don't think BWA calculates MAPQ at all; use the "XT:A:U" tag to fetch uniquely mapped reads. Moreover, you have probably noticed that reads which failed to map also "have" a MAPQ of 0.


There are also other MAPQ values, like 10, 13, 17, and so on. So something is being calculated.


So, as far as I understand, the MAPQ value is also 0 if there are other possible alignments, even ones with a lower score. A read can therefore carry the "XT:A:U" tag and at the same time have a MAPQ of 0, meaning that there are many other possible alignments with a slightly worse score.
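To check how often this combination actually occurs in a given file, the XT tag can be tabulated against MAPQ. The following is only a minimal sketch: it assumes plain SAM text (for example from samtools view -h), the function names are made up for illustration, and the three example records are invented, not taken from real data.

```python
from collections import Counter


def xt_tag(fields):
    """Return the value of the XT:A tag (e.g. 'U' or 'R'), or None if absent."""
    for field in fields[11:]:  # optional tags start at column 12
        if field.startswith("XT:A:"):
            return field[len("XT:A:"):]
    return None


def tabulate_xt_mapq(sam_lines):
    """Count alignment records by (XT value, MAPQ == 0)."""
    counts = Counter()
    for line in sam_lines:
        if line.startswith("@"):  # skip header lines
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:  # not a complete SAM record
            continue
        mapq = int(fields[4])  # MAPQ is column 5
        counts[(xt_tag(fields), mapq == 0)] += 1
    return counts


# Invented records, for illustration only.
example_records = [
    "@SQ\tSN:chr1\tLN:1000",
    "r1\t0\tchr1\t100\t37\t4M\t*\t0\t0\tACGT\tIIII\tXT:A:U",
    "r2\t0\tchr1\t200\t0\t4M\t*\t0\t0\tACGT\tIIII\tXT:A:U",
    "r3\t0\tchr1\t300\t0\t4M\t*\t0\t0\tACGT\tIIII\tXT:A:R",
]

for (xt, is_zero), n in sorted(tabulate_xt_mapq(example_records).items()):
    print(f"XT={xt} MAPQ==0: {is_zero} count={n}")
```

A large count for ("U", True) would be consistent with the explanation above, but the table alone does not say why BWA assigned MAPQ 0 to those reads.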

10.2 years ago

From the bwa manual page:

Note that XO and XG are generated by BWT search while the CIGAR string by Smith-Waterman alignment. These two tags may be inconsistent with the CIGAR string. This is not a bug.

So, I assume BWA gives a read a uniquely aligned tag even though the probability that it is aligned correctly is very low. This might be a consequence of allowing mismatches: the read could not be mapped at first, but with the allowed number of mismatches it could be mapped uniquely at a certain position, with a very high error rate. When I filter my data, I use a MAPQ threshold of 1, so that the reads I keep are uniquely aligned and also do not have the worst quality.

You can use samtools view -bq 1 file.bam > file_unique.bam for this.
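For context on what such a threshold means: the SAM specification defines MAPQ as -10 log10 of the probability that the mapping position is wrong, rounded to the nearest integer, with 255 meaning that the value is unavailable. A small sketch of that conversion is below. The function name is hypothetical, and how closely BWA's reported values follow this definition is not something this sketch can tell you.

```python
def mapq_to_error_probability(mapq):
    """Convert a Phred-scaled MAPQ into the implied probability of a wrong position.

    Follows the SAM definition MAPQ = -10 * log10(P_wrong).
    Returns None for 255, which the SAM format reserves for "not available".
    """
    if mapq == 255:
        return None
    return 10 ** (-mapq / 10)


for q in (0, 1, 10, 13, 17, 37):
    print(q, mapq_to_error_probability(q))
```

Under this definition, MAPQ 0 corresponds to an implied error probability of 1, so a cutoff of 1 removes exactly the reads whose placement the aligner considers no better than a guess.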

Someone reported the same scenario here for paired-end sequencing data.

Cheers


Do you by any chance know how to easily get only the reads that have a mapping quality of zero? Or alternatively, how to subtract one BAM file from another? :) I am now parsing the SAM file and filtering it on the MAPQ column, but I'd rather use some tool for this.


For the first question, even if there is some tool, I don't think you will gain any speed; grep or awk would be best. Subtracting one BAM from another is a different thing: you can use bedtools (subtractBed) for that, or try bamtools, whose filter command might work for you :)

Available bamtools commands:
convert         Converts between BAM and a number of other formats
count           Prints number of alignments in BAM file(s)
coverage        Prints coverage statistics from the input BAM file
filter          Filters BAM file(s) by user-specified criteria
index           Generates index for BAM file
merge           Merge multiple BAM files into single file
random          Select random alignments from existing BAM file(s), intended more as a testing tool.
resolve         Resolves paired-end reads (marking the IsProperPair flag as needed)
revert          Removes duplicate marks and restores original base qualities
sort            Sorts the BAM file according to some criteria
split           Splits a BAM file on user-specified property, creating a new BAM output file for each value found
stats           Prints some basic statistics from input BAM file(s)
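For the first question (only the reads with MAPQ 0), the column-based filtering that was suggested can be sketched as follows. This assumes SAM text as input, for example piped from samtools view -h file.bam. The function name and the include_unmapped switch are made up for this illustration, and the example records are invented.

```python
def keep_mapq_zero(sam_lines, include_unmapped=False):
    """Yield header lines and alignment records whose MAPQ is 0.

    Unmapped reads (FLAG bit 0x4) usually also carry MAPQ 0, so they are
    dropped unless include_unmapped is True.
    """
    for line in sam_lines:
        if line.startswith("@"):  # keep the header so the output is valid SAM
            yield line
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:  # not a complete SAM record
            continue
        flag = int(fields[1])  # FLAG is column 2
        mapq = int(fields[4])  # MAPQ is column 5
        if mapq != 0:
            continue
        if (flag & 0x4) and not include_unmapped:
            continue
        yield line


# Invented records, for illustration only.
example_sam = [
    "@SQ\tSN:chr1\tLN:1000",
    "r1\t0\tchr1\t100\t37\t4M\t*\t0\t0\tACGT\tIIII",
    "r2\t0\tchr1\t200\t0\t4M\t*\t0\t0\tACGT\tIIII",
    "r3\t4\t*\t0\t0\t*\t*\t0\t0\tACGT\tIIII",
]

for record in keep_mapq_zero(example_sam):
    print(record)
```

On a real file the same function could be fed sys.stdin line by line. Whether this is any faster than an awk one-liner on column 5 is doubtful; the main gain is that unmapped reads are handled explicitly.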