I'm new to htsjdk and bam file so I would really appreciate your help to get started on this. My first goal is simple. I want to use a .bai to random access a position in .bam given chromosome and position. Then I want to count the number of reads for ref allele, and the number of reads for alt allele that have MapQuality above certain value at that position.
I am using HTSJDK, but I found it difficult to find exactly what I need from its long documentation.
So far I've only been able to open the file, thanks to Pierre Lindenbaum's previous post:
SamReaderFactory srf=SamReaderFactory.make();
srf.validationStringency(ValidationStringency.LENIENT);
SAMFileReader samReader=new SAMFileReader(vsFile, new File(bai));
//I know SAMFileReader is deprecated, but I dont know what's the new equivalent
AbstractBAMFileIndex index=(AbstractBAMFileIndex) samReader.getIndex();
SAMRecordIterator iter=samReader.iterator();
Even that's not really what I want, since I need to be able to random access it.
So yea, didn't go very far, any help is much appreciated!!!
Thank you Mr. Lindenbaum for your response.
But in my understanding, iterator doesn't use a bai file to access exactly where the position is, instead it loops till it finds the position matching the interval. Is there a way to use a .bai file to random access a bam file location?
you're wrong.
I see, but I'm still confused, how it finds the interval without the input of a bai file. Would you help me make sense of it?
The bai is loaded if needed.
https://github.com/samtools/htsjdk/blob/master/src/main/java/htsjdk/samtools/BAMFileReader.java#L748
calls
https://github.com/samtools/htsjdk/blob/master/src/main/java/htsjdk/samtools/BAMFileReader.java#L261
calls
https://github.com/samtools/htsjdk/blob/master/src/main/java/htsjdk/samtools/BAMFileReader.java#L267
why don't you just try ?
I see! It does go find the index file if it's named by convention. Thanks!