Multi-mapping read adjacency in alignment tool outputs
2.5 years ago
Alex • 0

Hello friends,

I often work with short-read alignments in SAM/BAM format in a context where it is beneficial to know every alignment for each read, rather than relying on a best/primary/secondary distinction. My question:

Among short-read alignment tools and their configurations, is there ever a case where the alignments for a multi-mapping read are reported non-adjacently? I am aware that some tools support sorting their output by coordinate, but I would expect that to be easy to detect from an @HD header line with SO:coordinate. I am more concerned with default, unsorted output.
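To make the header check concrete, here is a minimal sketch of reading SO and GO from the @HD line of SAM text (for example, the output of samtools view -H). The function names parse_hd and adjacency_hint are my own, not part of any tool, and the sketch assumes a tab-delimited header as the SAM specification describes.

```python
def parse_hd(header_lines):
    """Return {'SO': ..., 'GO': ...} from the first @HD line (None if absent)."""
    fields = {"SO": None, "GO": None}
    for line in header_lines:
        if line.startswith("@HD"):
            # Header fields are tab-separated TAG:VALUE pairs after the record type.
            for token in line.rstrip("\n").split("\t")[1:]:
                key, _, value = token.partition(":")
                if key in fields:
                    fields[key] = value
            break
    return fields


def adjacency_hint(fields):
    """True: header implies records are grouped by read name.
    False: header implies coordinate/reference ordering.
    None: header gives no information either way."""
    if fields["SO"] == "queryname" or fields["GO"] == "query":
        return True
    if fields["SO"] == "coordinate" or fields["GO"] == "reference":
        return False
    return None
```

A None result is exactly the case I am asking about: the header is silent, so adjacency would have to be assumed or verified some other way.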

The reason I ask is that during read quantification I need to know how many alignments were produced for each multi-mapping read. This is cheap and easy to determine when a read's alignments are listed adjacent to one another, and intuitively it would make sense for that to be the case. However, of the documentation I have read, only STAR's explicitly mentions adjacency. I primarily use bowtie, but I want my scripts to support other short-read alignment tools such as HISAT2, Bowtie 2, and BWA.
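For illustration, this is roughly the single-pass counting I have in mind, written against plain SAM text so it needs no extra libraries. It is only a sketch: count_adjacent is a hypothetical name, it counts every mapped record per QNAME (so both mates of a pair and any supplementary records are included), and the set of seen names is there purely to detect a read name that reappears later, at the cost of memory proportional to the number of distinct reads.

```python
from itertools import groupby


def count_adjacent(sam_lines):
    """Count mapped records per read name, assuming each read's records are adjacent.

    Raises ValueError if a read name reappears after a different read,
    i.e. if the adjacency assumption does not hold for this input.
    """
    # Skip blank lines and header lines (QNAME cannot contain '@' per the SAM spec).
    records = (line.rstrip("\n").split("\t")
               for line in sam_lines
               if line.strip() and not line.startswith("@"))
    # FLAG bit 0x4 marks an unmapped record, which is not an alignment.
    mapped = (rec for rec in records if not int(rec[1]) & 0x4)

    seen = set()
    counts = {}
    for qname, group in groupby(mapped, key=lambda rec: rec[0]):
        if qname in seen:
            raise ValueError("non-adjacent alignments for read " + qname)
        seen.add(qname)
        counts[qname] = sum(1 for _ in group)
    return counts
```

If adjacency could be trusted, the seen set could be dropped and the counts streamed out one read at a time, which is the cheap case I am hoping for.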

I'm aware of the NH:i field, but it isn't reported by all tools (for example, bowtie instead reports XM:i, which has other complications, and I believe X0 + X1 is one approach with BWA). The closest thing to a guarantee of adjacency seems to be an @HD header reporting SO:queryname or GO:query, but I would prefer to skip samtools sort -n if I can assume adjacency for files that don't report SO:coordinate or GO:reference.
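For tools that do emit NH:i, reading it is straightforward. The sketch below pulls an integer-typed optional tag from a SAM text record. The name get_int_tag is hypothetical, and it returns None when the tag is absent, which is the situation that forces a fallback such as counting adjacent records.

```python
def get_int_tag(sam_line, tag):
    """Return the integer value of an optional TAG:i:VALUE field, or None if absent."""
    fields = sam_line.rstrip("\n").split("\t")
    prefix = tag + ":i:"
    # Optional fields start at column 12 (index 11) of a SAM record.
    for opt in fields[11:]:
        if opt.startswith(prefix):
            return int(opt[len(prefix):])
    return None
```

For example, get_int_tag(line, "NH") would give the reported hit count where the aligner provides one. How aligner-specific tags such as XM, X0, and X1 should be interpreted depends on each tool's documentation, so I have not encoded any of that here.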

aligners multi-mapping alignments • 455 views