Question: Extract alignments from a BAM file by read name
27 days ago, 2xijok (Japan) wrote:

Hello.

I have a question about BAM files. How can I extract the alignments of a read with a specific name from a BAM file? I searched Biostars and found the following answer:

Question: (Closed) Efficiently Extracting Reads With Specific Names ('Queryname') From .Bam File

samtools view file.bam | grep queryname - > subset.sam

This is a simple and good answer, but since it is seven years old, I thought there might be a better way today. Is there a more efficient way to extract reads from a BAM file, especially when it is sorted by name?

(I would like to know how to extract by read name, not by genomic position.)

27 days ago, Pierre Lindenbaum (France/Nantes/Institut du Thorax - INSERM UMR1087) wrote:

Use Picard FilterSamReads (https://broadinstitute.github.io/picard/command-line-overview.html#FilterSamReads) with the READ_LIST_FILE option, as I said in C: Extracting Subsets Of Reads From A Bam File
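A minimal sketch of what that could look like, assuming a local Picard jar (the PICARD_JAR path, the read names, and the output file name are placeholders, not taken from the thread). READ_LIST_FILE is a plain text file with one read name per line, and FILTER=includeReadList keeps only the reads listed in it:

```shell
# Hypothetical read names, one per line, as READ_LIST_FILE expects.
printf '%s\n' 'read_0001' 'read_0042' > read_names.txt

# PICARD_JAR is an assumed location; the call is skipped if the jar or input is absent.
PICARD_JAR="${PICARD_JAR:-picard.jar}"
if [ -f "$PICARD_JAR" ] && [ -f file.bam ]; then
    java -jar "$PICARD_JAR" FilterSamReads \
        I=file.bam \
        O=subset.bam \
        READ_LIST_FILE=read_names.txt \
        FILTER=includeReadList
fi
```

This approach pays off mainly when many names have to be extracted in one pass, since the list is matched while the file is read once.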

27 days ago, Jorge Amigo (Santiago de Compostela, Spain) wrote:

If you are looking for a particular read group, you can filter your bam file directly:

samtools view -br RGTAG file.bam > file.RGTAG.bam

If you're looking for a particular read name, piping the BAM file through grep is still a viable way to do it. You can speed it up, though, if only one matching line is expected, by making grep stop at the first match with -m1:

samtools view file.bam | grep -m1 queryname - > subset.sam

You can speed it up even more if only one read is expected and you roughly know the region where it could have been mapped (say chr3:1000-2000), provided the BAM file is coordinate-sorted and indexed:

samtools view file.bam chr3:1000-2000 | grep -m1 queryname - > subset.sam

If you can use this last option, the result should come back almost immediately.
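The region-restricted search can be wrapped in a small helper. This is only a sketch: the function name extract_read_in_region is hypothetical, and it assumes samtools is on the PATH and that the BAM file is coordinate-sorted, which samtools index requires.

```shell
# Hypothetical helper: extract_read_in_region BAM REGION QNAME
extract_read_in_region() {
    bam=$1 region=$2 qname=$3
    # Region queries need an index; build one if neither .bai nor .csi exists.
    if [ ! -f "$bam.bai" ] && [ ! -f "$bam.csi" ]; then
        samtools index "$bam"
    fi
    # Only alignments overlapping REGION are decoded; grep stops at the first hit.
    samtools view "$bam" "$region" | grep -m1 "$qname"
}
```

It could then be called as: extract_read_in_region file.bam chr3:1000-2000 queryname > subset.sam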


I would use

grep -F -w -m1
(comment written 27 days ago by Pierre Lindenbaum)

You're right. That should be faster and more precise.
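To illustrate the difference, here is a small sketch on two made-up SAM records (the read names and the toy.sam file are invented for the example). Plain grep treats the name as a regular expression and matches substrings, so read1 also hits read10. With -F the name is taken as a fixed string and with -w it must match as a whole word. Comparing only the first column with awk is a stricter alternative, since -w can still match the name elsewhere on the line.

```shell
# Toy SAM records (tab-separated); read10 comes first on purpose.
printf 'read10\t0\tchr3\t1600\t60\t4M\t*\t0\t0\tACGT\tIIII\n' >  toy.sam
printf 'read1\t0\tchr3\t1500\t60\t4M\t*\t0\t0\tACGT\tIIII\n'  >> toy.sam

# Plain grep matches substrings, so it counts both records.
grep -c 'read1' toy.sam

# -F: fixed string, -w: whole-word match, -m1: stop after the first hit.
grep -F -w -m1 'read1' toy.sam > subset.sam

# Stricter alternative: compare only the QNAME column (field 1).
awk -F '\t' -v q='read1' '$1 == q' toy.sam > subset_awk.sam
```

On real data the same options apply to the piped output of samtools view instead of toy.sam.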

(comment written 27 days ago by Jorge Amigo)