I'd like to access the records in a BAM file by their offsets.
I'm using HTSJDK. Suppose I've got a record in a BAM file. I call getFileSource().getFilePointer(). If I cast the returned value to BAMFileSpan, I can access the toCoordinateArray method (see JavaDoc), which gives me a pair of numbers that are offsets in the BAM file and define the chunk that contains the record. I think those are the virtual offsets from the SAM/BAM specification, but I'm not sure.
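To make clear what I mean by virtual offsets: as I read the SAM/BAM specification, a BGZF virtual file offset is a 64-bit number whose upper 48 bits are the byte offset of the start of a compressed BGZF block in the file and whose lower 16 bits are the offset inside that block once it is uncompressed. Here is a minimal sketch of that packing, assuming the numbers I get back really are such offsets. The class VirtualOffset and its method names are my own hypothetical helper, not part of HTSJDK.

```java
/** Hypothetical helper (not part of HTSJDK) for BGZF virtual file offsets. */
public final class VirtualOffset {
    // Lower 16 bits: offset within the uncompressed BGZF block.
    private static final int OFFSET_BITS = 16;
    private static final long OFFSET_MASK = 0xFFFFL;
    // Upper 48 bits: file offset of the start of the compressed block.
    private static final long MAX_BLOCK_ADDRESS = (1L << 48) - 1;

    private VirtualOffset() {}

    /** Packs a block address and an in-block offset into one virtual offset. */
    static long pack(long blockAddress, int offsetInBlock) {
        if (blockAddress < 0 || blockAddress > MAX_BLOCK_ADDRESS) {
            throw new IllegalArgumentException("block address out of range: " + blockAddress);
        }
        if (offsetInBlock < 0 || offsetInBlock > OFFSET_MASK) {
            throw new IllegalArgumentException("in-block offset out of range: " + offsetInBlock);
        }
        return (blockAddress << OFFSET_BITS) | offsetInBlock;
    }

    /** File offset (in bytes) of the compressed block that holds the data. */
    static long blockAddress(long virtualOffset) {
        return virtualOffset >>> OFFSET_BITS;
    }

    /** Offset of the data within the uncompressed contents of that block. */
    static int offsetInBlock(long virtualOffset) {
        return (int) (virtualOffset & OFFSET_MASK);
    }

    public static void main(String[] args) {
        // Made-up values, only to show the round trip.
        long v = pack(1000L, 42);
        System.out.println("virtual offset:  " + v);
        System.out.println("block address:   " + blockAddress(v));
        System.out.println("offset in block: " + offsetInBlock(v));
    }
}
```

If that reading is right, the first number of the pair would tell me which compressed block to seek to and where inside it the record starts, which is exactly the handle I would need for random access.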
Is there a way to access the record in a BAM file using those offsets in HTSJDK? Failing that, in a different Java library or a library with Java bindings? Failing that, Python would do as well. The BAM file is neither indexed nor sorted, so SamReader's query methods do not apply.
Errr, why not just sort and index the file? Holding all of that in memory would be a real waste of time and RAM.
Because I need random access in a sorting algorithm that I want to try. Using an existing tool to sort would defeat the purpose.