HISAT2 (v2.0.0-beta) MAPQ query for allele specific expression analysis
1
0
Entering edit mode
5.6 years ago
prasundutta87 ▴ 640

Hello,

For an allele specific expression analysis, one of the ways to reduce ambiguous mapping reads is to extract or use only unique mapping (not referring to primary alignment or unique alignment here) reads from an alignment file.

For this, highest MAPQ score (column 5 in a BAM file) can be used as a uniquely mapped read will always have the highest mapping quality and also the NH (no. of hits) tag will always be 1 (NH:i:1).

HISAT2 versions 2.0.4 and 2.0.5 have given the uniquely mapping reads a MAPQ of 60 and their NH tag is always 1. Extracting or using uniquely mapping reads from BAM files produced from these versions of HISAT is fairly simple (can use samtools view -q 60 for considering only uniqely mapped reads). If we do not make a new BAM file consisiting of only reads having MAPQ=60 from the original BAM file, we can also instruct downstream tools (such as variant callers or read counter tools) to consider reads having MAPQ=60.

Unfortunately, some BAM files I am dealing with have been produced using HISAT2 (v2.0.0-beta) in which there is a bug where in the NH tag is not properly assigned. To be more specific, the highest MAPQ assigned is 255 (for uniqely mapped read) but the NH tag is not always 1. This bug has been solved v 2.0.4 onwards though.

Can someone help me or guide me on how to extract uniquely mapped reads in this case? Also, if I have understood or explained some concept wrongly, please let me know.

alignment allele specific expression • 2.7k views
0
Entering edit mode

SAM specifications defines MAPQ of 255 as "indicating that the mapping quality is not available". Assuming HISAT2 follows the specifications, MAPQ of 255 does not indicate a uniquely mapped read and does not enforce NH=1.

0
Entering edit mode

Thanks for your reply. But that was the reason the Hisat developers changed it from 255 to 60 from v2.0.4 onwards.

1
Entering edit mode
5.6 years ago

Is it the case that the MAPQ of 255 for "unique" is wrong or that the NH is wrong or both. If the MAPQ of 255 indicating "unique alignment" is correct then just samtools view -q 60 again and call it done. If that's wrong and the NH auxiliary tag is correct then grep -e "^@SQ" -e "NH:i:1 " or something along those lines.

0
Entering edit mode

I am not completely sure. As a trend, the reads with highest mapping quality always has NH:i:1 (mostly applicable to newly developed aligners which report the NH tag). The issue is that some reads which have the score as 255, have NH:i:2 or NH:i:3 as well. Also, some reads which have NH:i:1 have scores 0 or 1 as well.

Worst case scenario-I have to realign using new version of hisat2.

1
Entering edit mode

Given the uncertainty, go ahead and use a newer version rather than wasting more time on this.

0
Entering edit mode

Thanks Devon..that seems to be the only way now..