Question: Random access to BAM files by offset in HTSJDK or Java in general
12 months ago, Alexey wrote:

I'd like to access the records in a BAM file by their offsets.

I'm using HTSJDK. Given a record read from a BAM file, I call getFileSource().getFilePointer(). If I cast the returned value to BAMFileSpan, I can call its toCoordinateArray method (see JavaDoc), which gives me a pair of numbers: the offsets in the BAM file that delimit the chunk containing the record. I think those are the virtual file offsets from the SAM/BAM specification, but I'm not sure.
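For reference, here is a minimal sketch of how such a value splits apart, assuming the numbers really are BGZF virtual file offsets as the SAM/BAM specification defines them: the upper 48 bits hold the byte position of the compressed block in the file, and the lower 16 bits hold the position inside that block once it is uncompressed. The class and method names (VirtualOffset, blockAddress, blockOffset, pack) are made up for illustration and are not part of HTSJDK.

```java
// Hypothetical helper, not an HTSJDK class: splits and packs BGZF virtual file offsets.
final class VirtualOffset {
    private VirtualOffset() {}

    /** Byte position of the BGZF block within the compressed file (upper 48 bits). */
    static long blockAddress(long virtualOffset) {
        return virtualOffset >>> 16;
    }

    /** Byte position within the uncompressed block (lower 16 bits). */
    static int blockOffset(long virtualOffset) {
        return (int) (virtualOffset & 0xFFFFL);
    }

    /** Inverse of the two accessors above. */
    static long pack(long blockAddress, int blockOffset) {
        return (blockAddress << 16) | (blockOffset & 0xFFFFL);
    }

    public static void main(String[] args) {
        long v = pack(123_456L, 789);
        // prints "123456 789"
        System.out.println(blockAddress(v) + " " + blockOffset(v));
    }
}
```

If the decoded block address of a record's chunk start lands on the first byte of a BGZF block in the file, that is a good sign the interpretation is right.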

Is there a way to access the record in a BAM file using those offsets in HTSJDK? Failing that, in a different Java library or a library with Java bindings? Failing that, Python would do as well. The BAM file is neither indexed nor sorted, so SamReader's query methods do not apply.


Errr, why not just sort and index the file? Holding all of that in memory would be a real waste of time and RAM.

written 12 months ago by Devon Ryan

Because I need random access in a sorting algorithm that I want to try. Using an existing tool to sort would defeat the purpose.

written 12 months ago by Alexey