Is there a standard way to index a BAM Index (BAI)?

2

10.7 years ago

danvdk ▴ 80

A BAM index (BAI) file lets you map loci on the genome to a range of byte offsets in a BAM file. It's essential for browsing large pileups in interactive visualizations like IGV and BioDalliance.

But at some point, the BAM file grows so large that its corresponding BAI file also gets unwieldy. For example, I have an 80 GB BAM file with a 9 MB BAI file. Loading this BAI file over even a relatively fast network takes many seconds, far longer than users of modern web pages are accustomed to waiting.

One solution to this problem would be to only load portions of the BAI file. For example, if I'm looking at chr20, there's no need to download the portions of the BAI file that deal with the other chromosomes. The BAI format doesn't lend itself well to random seeking, however, so this would require some kind of index.
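To make the idea concrete, here is a minimal sketch of such an index: a list of byte ranges, one per reference sequence, computed by walking the BAI once. It assumes the little-endian BAI layout described in the SAM specification (magic `BAI\1`, a reference count, then for each reference its bins, chunks, and linear-index intervals). The function name `index_bai` is hypothetical and not part of any existing tool.

```python
import struct


def index_bai(data: bytes) -> list:
    """Return one (start, end) byte range per reference in a BAI file.

    Assumes the BAI layout from the SAM specification; `end` is exclusive.
    """
    if data[:4] != b"BAI\x01":
        raise ValueError("not a BAI file")
    (n_ref,) = struct.unpack_from("<i", data, 4)
    pos = 8
    ranges = []
    for _ in range(n_ref):
        start = pos
        (n_bin,) = struct.unpack_from("<i", data, pos)
        pos += 4
        for _ in range(n_bin):
            # Each bin: uint32 bin id, int32 chunk count, then 16 bytes per chunk.
            _bin_id, n_chunk = struct.unpack_from("<Ii", data, pos)
            pos += 8 + 16 * n_chunk
        # Linear index: int32 interval count, then one uint64 offset per interval.
        (n_intv,) = struct.unpack_from("<i", data, pos)
        pos += 4 + 8 * n_intv
        ranges.append((start, pos))
    return ranges
```

With these ranges stored in a small side file, a viewer could fetch only the slice of the BAI for the chromosome being displayed, for example via an HTTP Range request, provided the server supports one.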

Is there a standard way to index a BAM Index file?

bai alignment bam • 5.1k views

ADD COMMENT • link updated 4.3 years ago by Ram 45k • written 10.7 years ago by danvdk ▴ 80

1

I wound up implementing a BAI indexer in Python (bai-indexer) and added support for this to BioDalliance.

ADD REPLY • link updated 4.3 years ago by Ram 45k • written 10.7 years ago by danvdk ▴ 80

0

There isn't one, though it's probably not a bad idea. You might propose something on the samtools-devel mailing list.

ADD REPLY • link updated 4.3 years ago by Ram 45k • written 10.7 years ago by Devon Ryan 105k

2

Just making the observation that when we need to index the index file, something is evolving the wrong way.

As the meme goes: I've indexed your index file so you can be indexing while indexing...

ADD REPLY • link updated 4.3 years ago by Ram 45k • written 10.7 years ago by Istvan Albert 102k

0

Perhaps the answer is the CRAM file format that was added to samtools:

CRAM goes mainline

ADD REPLY • link updated 4.3 years ago by Ram 45k • written 10.7 years ago by Istvan Albert 102k
