I have a dockerized app that runs IGV.js and serves files from a backend. For typical alignment files from shotgun sequencing (Illumina etc.), I can see that my server responds correctly to the range requests sent by IGV:
Content-Range: bytes 1969716952-1969767576/2396323876 Content-Type: application/octet-stream
This works fine: my server responds with a 206 HTTP status code (Partial Content) for almost all BAM/FASTA reference pairs. The resulting payloads are always around 12-50 kB. This is the expected behavior, as it avoids IGV having to download the whole BAM/FASTA.
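As a quick sanity check on the payload size, the `Content-Range` value quoted above can be decoded with a few lines of Python. This is only a sketch: the helper name `describe_content_range` is made up for illustration, and it handles only the simple `bytes first-last/total` form.

```python
import re

# Matches only the simple "bytes first-last/total" form of the header value.
_CONTENT_RANGE = re.compile(r"bytes (\d+)-(\d+)/(\d+)")


def describe_content_range(value: str) -> tuple[int, float]:
    """Return (payload size in bytes, fraction of the whole file served)."""
    match = _CONTENT_RANGE.fullmatch(value.strip())
    if match is None:
        raise ValueError(f"unsupported Content-Range value: {value!r}")
    first, last, total = map(int, match.groups())
    size = last - first + 1  # byte ranges are inclusive on both ends
    return size, size / total


size, fraction = describe_content_range("bytes 1969716952-1969767576/2396323876")
print(f"{size} bytes, {fraction:.4%} of the file")
```

For the header above this gives a payload of 50625 bytes, which matches the 12-50 kB range I see in the network tab.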
Given: a reference file of around 4 kB (reverse-transcribed RNA) with hundreds of reads (very high depth).
In certain cases where the BAM files come from Oxford Nanopore data, IGV.js will request the entire BAM file, whether it is 10 MB or 700 MB. The range request from the browser (initiated by IGV.js) looks like this:
And of course the server has no choice but to send almost the entire file:
Content-Range: bytes 270-7327614/7612045
This basically requests the entire file and causes the browser to freeze when the BAM file is 700 MB, until the entire BAM has been read into memory. At first it was difficult to understand why this happens. Then I inspected the index (.bai) files and realized that, regardless of BAM file size, the generated *.bai file is always 96 bytes. This does not occur in normal BAM/FASTA pairs, where the *.bai files are generally around
I am guessing that the above is the reason why IGV.js requests the entire file for ONT datasets. Why are these *.bai files always 96 bytes, and how can I fix it?
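One way to see what such a small index actually contains is to parse it according to the BAI layout in the SAM specification (magic `BAI\1`, then per reference a list of bins with chunks and a linear index of 16 kb windows). The sketch below is not something I have run against the real files; the function names are hypothetical, and it only counts entries rather than decoding the virtual offsets.

```python
import struct

METADATA_BIN = 37450  # pseudo-bin that stores mapped/unmapped read statistics


def summarize_bai(data: bytes) -> list[dict]:
    """Count bins, chunks and linear-index intervals per reference."""
    if data[:4] != b"BAI\x01":
        raise ValueError("not a BAI file (bad magic)")
    (n_ref,) = struct.unpack_from("<i", data, 4)
    pos = 8
    summary = []
    for _ref in range(n_ref):
        (n_bin,) = struct.unpack_from("<i", data, pos)
        pos += 4
        bins = chunks = 0
        has_metadata = False
        for _bin in range(n_bin):
            bin_id, n_chunk = struct.unpack_from("<Ii", data, pos)
            pos += 8 + 16 * n_chunk  # skip the (begin, end) virtual offsets
            if bin_id == METADATA_BIN:
                has_metadata = True  # not a real bin, so not counted below
            else:
                bins += 1
                chunks += n_chunk
        (n_intv,) = struct.unpack_from("<i", data, pos)
        pos += 4 + 8 * n_intv  # one 64-bit offset per 16 kb window
        summary.append({"bins": bins, "chunks": chunks,
                        "intervals": n_intv, "metadata_bin": has_metadata})
    return summary


def summarize_bai_file(path: str) -> list[dict]:
    """Convenience wrapper: read a .bai file from disk and summarize it."""
    with open(path, "rb") as handle:
        return summarize_bai(handle.read())
```

Under that layout, an index with one reference, one regular bin holding one chunk, the metadata pseudo-bin, one linear-index interval and the trailing unplaced-read count adds up to 8 + 4 + 24 + 40 + 12 + 8 = 96 bytes. That would be consistent with the size I see, though it is only a guess until the actual files are parsed.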
WHAT HAS BEEN TRIED
Setting the visibilityWindow option for IGV.js to 50 base pairs, so that IGV does not request anything until the viewport is small enough, did not work. Even at 50 bp, flanking regions are not requested, which means that if you scroll even slightly to the left or right, IGV re-requests the entire file every time.
(Below are pseudo descriptions, I did not run them literally as they are in bash)
SAMTools Mappings Sorter .mapped.bam > .mapped.sorted.bam
SAMTools Mappings Indexer .mapped.sorted.bam > mapped.sorted.bam.bai -->
Minimap2 Aligner for Long Reads .fastq.gz > .mapped.bam + .mmi + .mapped.bam.bai -->
In both cases above, the generated .bai files were always 96 bytes. This file size did not vary with the BAM file size.
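For readers who want to reproduce the steps on the command line, the pseudo descriptions above correspond roughly to the following minimap2 and samtools invocations. This is a sketch under assumptions: the file names and the `alignment_commands` helper are hypothetical, the `map-ont` preset is my assumption for Nanopore reads, and these are not the literal commands I ran.

```python
import subprocess


def alignment_commands(ref_fasta: str, reads_fastq: str, prefix: str) -> list[list[str]]:
    """Build the argument lists only; nothing is executed here."""
    sam = f"{prefix}.mapped.sam"
    sorted_bam = f"{prefix}.mapped.sorted.bam"
    return [
        # Align the reads; -a emits SAM, -x map-ont selects the Nanopore preset.
        ["minimap2", "-a", "-x", "map-ont", "-o", sam, ref_fasta, reads_fastq],
        # Coordinate-sort the alignments, which samtools index requires.
        ["samtools", "sort", "-o", sorted_bam, sam],
        # Writes <sorted_bam>.bai next to the sorted BAM.
        ["samtools", "index", sorted_bam],
    ]


def run_pipeline(ref_fasta: str, reads_fastq: str, prefix: str) -> None:
    """Run each step in order and stop at the first failing command."""
    for command in alignment_commands(ref_fasta, reads_fastq, prefix):
        subprocess.run(command, check=True)
```

Calling `run_pipeline("ref.fa", "reads.fastq.gz", "sample")` would require minimap2 and samtools on the PATH and would leave `sample.mapped.sorted.bam.bai` beside the sorted BAM.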
I do not own the data, and I am not authorized to share it.