Extract the alignments from a Bam file by name of the read
3.6 years ago
kojix2 ▴ 250

Hello.

I have a question about BAM files. How can I extract the alignments of a read with a specific name from a BAM file? I searched Biostars and found the following answer:

Question: (Closed) Efficiently Extracting Reads With Specific Names ('Queryname') From .Bam File

samtools view file.bam | grep queryname - > subset.sam

Yes, this is a simple and great answer, but since it is seven years old, I thought there might be a better way today. Is there a more efficient way to extract reads by name from a BAM file, especially when the file is sorted by name?

(I would like to know how to extract by name and not by location.)

alignment • 7.4k views
3.6 years ago

If you are looking for a particular read group, you can filter your bam file directly:

samtools view -br RGTAG file.bam > file.RGTAG.bam

If you're looking for a particular read name, piping the BAM file through grep is still a viable way to do it. If only one matching line is expected, you can speed it up by making grep stop at the first match with -m1:

samtools view file.bam | grep -m1 queryname - > subset.sam

You can try to optimize it even more, if only 1 read is expected and you more or less know the region (say chr3:1000-2000 as an example) where the read could have been mapped:

samtools view file.bam chr3:1000-2000 | grep -m1 queryname - > subset.sam

If this last option is available to you, the result should come back almost immediately. Note that a region query requires a coordinate-sorted and indexed BAM file.
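One caveat with the grep commands above is that the pattern can match anywhere in the line, not just the read name. A minimal sketch of a stricter filter, assuming the SAM text arrives on stdin, compares the first column (QNAME) exactly. The function name `filter_by_qname` and its optional `first` argument are hypothetical helpers made up for this illustration, not part of samtools.

```shell
# Hypothetical helper: print SAM records whose QNAME (column 1) equals $1 exactly.
# Reads SAM text on stdin. If the second argument is "first", stop after one hit
# (similar in spirit to grep -m1); otherwise print every matching record.
filter_by_qname() {
    awk -F '\t' -v name="$1" -v mode="${2:-all}" '
        /^@/ { next }                      # skip header lines, if any are present
        $1 == name {
            print
            if (mode == "first") exit      # stop reading as soon as one record is found
        }
    '
}

# Usage with samtools (file.bam and queryname are placeholders):
#   samtools view file.bam | filter_by_qname queryname > subset.sam
#   samtools view file.bam chr3:1000-2000 | filter_by_qname queryname first > subset.sam
```

Without `first`, all records with that name are printed, which matters for paired-end data where both mates usually share one read name.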

I would use

grep -F -w -m1
You're right. That should be faster and more precise.
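To see why those flags help, here is a small sketch on made-up, SAM-like records (the read names and fields are invented for illustration). `-F` treats the name as a fixed string instead of a regular expression, and `-w` rejects matches that are only part of a longer word.

```shell
# Toy tab-separated records; only the first column (read name) matters here.
records=$(printf 'read1\t0\tchr3\t1500\nread10\t0\tchr3\t1600\nread11\t0\tchr3\t1700\n')

# Plain grep matches the pattern as a substring, so read10 and read11 are hits too.
plain=$(printf '%s\n' "$records" | grep -c 'read1')

# -F: fixed-string search (no regex parsing); -w: whole-word matches only.
strict=$(printf '%s\n' "$records" | grep -c -F -w 'read1')

echo "plain=$plain strict=$strict"
```

Note that `-w` only checks for word boundaries, so a name such as `read1/2` would still match `read1`, and the match can occur in any column. Comparing the first column exactly (for example with awk) is stricter if that matters for your data.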
