Off topic: Efficiently Extracting Reads With Specific Names ('Queryname') From .Bam File
9.1 years ago
Isaac Joseph ▴ 120

Greetings all. The problem before me is as follows:

I have a fairly large .bam file, and from that file I need to find all mapping locations of a particular read name (the "queryname" in BAM terminology). Is there an efficient way to do this? Picard offers "FilterSamReads.jar", but that tool is actually slower than converting the .bam file to a .sam file and using grep to extract the reads with the given names.
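For reference, the grep-style baseline can be sketched in Python as a single streaming pass over SAM text. This is only a minimal sketch: the function name `filter_sam_by_qname` and the sample records are hypothetical, and it still reads every line, so it is not the efficient lookup being asked for. Unlike a plain grep, it compares the whole first column, so a search for "read1" does not also return "read10".

```python
from typing import Iterable, Iterator, Set


def filter_sam_by_qname(lines: Iterable[str], wanted: Set[str]) -> Iterator[str]:
    """Yield SAM header lines plus alignment lines whose QNAME is in `wanted`."""
    for line in lines:
        # Header lines start with '@'; a QNAME cannot contain '@'.
        if line.startswith("@"):
            yield line
            continue
        # QNAME is the first tab-separated column of an alignment line.
        qname = line.split("\t", 1)[0]
        if qname in wanted:
            yield line


# Hypothetical records, for illustration only.
sample = [
    "@HD\tVN:1.6\tSO:unsorted\n",
    "read1\t0\tchr1\t100\t60\t4M\t*\t0\t0\tACGT\tIIII\n",
    "read10\t0\tchr1\t200\t60\t4M\t*\t0\t0\tACGT\tIIII\n",
    "read2\t0\tchr2\t300\t60\t4M\t*\t0\t0\tACGT\tIIII\n",
    "read1\t256\tchr3\t400\t0\t4M\t*\t0\t0\tACGT\tIIII\n",
]
hits = list(filter_sam_by_qname(sample, {"read1"}))
```

In practice the `lines` argument could be `sys.stdin` fed by `samtools view -h` on the .bam file, which avoids writing an intermediate .sam file.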

In particular, one would imagine that sorting the .bam file by queryname (using samtools sort -n) could be exploited to do this efficiently, in the same way that one can extract all mappings to a particular reference from a coordinate-sorted .bam file (produced by "samtools sort" without the -n option).
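One modest gain from a queryname-sorted file can be sketched as follows: all records for a given name are contiguous, so a scan can stop as soon as the matching block ends. This is a sketch under that assumption only; the function name `reads_from_name_grouped` and the sample records are hypothetical. It relies on grouping rather than on the exact ordering rule, and it is still a linear scan up to the match, not an indexed lookup.

```python
from typing import Iterable, Iterator


def reads_from_name_grouped(lines: Iterable[str], target: str) -> Iterator[str]:
    """Yield alignment lines for `target` from name-grouped SAM text, stopping early."""
    seen = False
    for line in lines:
        if line.startswith("@"):
            continue  # skip header lines
        qname = line.split("\t", 1)[0]
        if qname == target:
            seen = True
            yield line
        elif seen:
            # The contiguous block for `target` has ended; nothing later can match.
            return


# Hypothetical name-grouped records, for illustration only.
grouped_sample = [
    "@HD\tVN:1.6\tSO:queryname\n",
    "read1\t0\tchr1\t100\t60\t4M\t*\t0\t0\tACGT\tIIII\n",
    "read2\t0\tchr2\t300\t60\t4M\t*\t0\t0\tACGT\tIIII\n",
    "read2\t256\tchr5\t700\t0\t4M\t*\t0\t0\tACGT\tIIII\n",
    "read3\t0\tchr1\t900\t60\t4M\t*\t0\t0\tACGT\tIIII\n",
    "read4\t0\tchr4\t50\t60\t4M\t*\t0\t0\tACGT\tIIII\n",
]
stream = iter(grouped_sample)
matches = list(reads_from_name_grouped(stream, "read2"))
```

After the call, `stream` still holds the unread tail of the input, which is the work the early stop saves.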

So, the main purpose of this post is to verify that no efficient method already exists before I spend the time writing a new one.

Cheers.

mapping sam bam picard samtools • 19k views