Okay, I will be a little more specific.
With the .bai file you can randomly access any position in a BAM file. The goal is to access the region that contains the unmapped reads.
samtools view -f 4 input.bam will output the unmapped reads, but if an unmapped read's mate is mapped to a specific chromosome, that read is still "aligned" to the chromosome, i.e. it has an rname and a position.
I want to access only the reads that have no rname. Yes, they will be unmapped (i.e. -f 4), but the set of reads that I want won't include the ones that have an rname.
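To make the distinction concrete, here is a minimal sketch of the filter I mean, assuming standard tab-separated SAM text on stdin (flag in column 2, rname in column 3). The helper name unplaced_only is made up for illustration, and this still scans every record, so it only shows which reads I am after, not a fast way to get them.

```shell
# Hypothetical helper: reads SAM records on stdin and prints only those that are
# unmapped (flag bit 0x4 set) AND have no reference name (column 3 is "*").
unplaced_only() {
    awk -F'\t' '
        /^@/ { next }                        # skip SAM header lines, if any
        int($2 / 4) % 2 == 1 && $3 == "*"    # bit 0x4 set and rname is "*"
    '
}

# Example use (a full scan, not random access):
#   samtools view -f 4 input.bam | unplaced_only
```

An unmapped read whose mate is mapped has the mate's chromosome in column 3, so it is dropped by this filter.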
samtools idxstats input.bam
chrUn_gl000241 ? ? ?
chrUn_gl000242 ? ? ?
chrUn_gl000243 ? ? ?
chrUn_gl000244 ? ? ?
chrUn_gl000245 ? ? ?
chrUn_gl000246 ? ? ?
chrUn_gl000247 ? ? ?
chrUn_gl000248 ? ? ?
chrUn_gl000249 ? ? ?
* ? ? ? <--------------------------------------I WANT THIS
I want to access the last region (i.e. the reads that do not have an rname).
samtools view input.bam '*' doesn't work. These reads are always located at the end of a coordinate-sorted BAM file, so it seems inefficient to run through the whole BAM file just to grab them (i.e. I want random access if possible).
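A rough sketch of pulling the number out of that last row, assuming the usual tab-separated idxstats layout (reference name, sequence length, mapped count, unmapped count). The helper name unplaced_count is made up for illustration. It only reports how many such reads there are; it does not retrieve them.

```shell
# Hypothetical helper: reads `samtools idxstats` output on stdin and prints the
# unmapped-read count (column 4) from the row whose reference name is "*".
unplaced_count() {
    awk -F'\t' '$1 == "*" { print $4 }'
}

# Example use:
#   samtools idxstats input.bam | unplaced_count
```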
Let me know if this makes sense.
(Additional comments: In a sorted BAM file, all pairs in which both the read and its mate are unmapped are placed at the end of the file. The user asked whether there is a fast way to access those reads directly without going through the whole BAM file. The command "samtools view -f 12 input.bam" takes too long.)
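For reference, the 12 in -f 12 is 4 + 8, i.e. the "read unmapped" bit (0x4) plus the "mate unmapped" bit (0x8), and -f keeps a record only when all of those bits are set in its flag. A small sketch of that check, with a made-up helper name has_all_bits:

```shell
# Hypothetical helper: succeeds if every bit of MASK ($2) is set in FLAG ($1),
# which is the condition `samtools view -f MASK` applies to each record.
has_all_bits() {
    [ $(( $1 & $2 )) -eq "$2" ]
}

# Example use:
#   has_all_bits 77 12 && echo "read and mate both unmapped"
```

A flag of 77 (paired, read unmapped, mate unmapped, first in pair) passes the check, while 69 (paired, read unmapped, first in pair, mate mapped) does not.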