extracting mapped reads from the bam file
2
0
Entering edit mode
4.0 years ago
KVC_bioinfo ▴ 550

Hello all,

I am trying to extract the mapped reads from a bam file for a specific region.

samtools view -h in.bam "chr1:regionstart-regionend" > out.bam

this gives me all the reads falling into that region. (which contains read that mapped into that region plus reads unmapped into that region)

I am looking for reads that mapped to that specific region. Which options can I include?

Or samtools in unable to do it?

Thank you in advance.

mapped • 4.0k views
ADD COMMENT
2
Entering edit mode

How are you distinguishing between reads that "fall" in a position, from reads that "map" to it?

ADD REPLY
0
Entering edit mode

That is basically I am not able to do it. I am interested in extracting reads mapped in a specific region.

ADD REPLY
1
Entering edit mode

To most of us, there is no difference in what you've done and what you're asking for, so clarification is required. Please elaborate on what the difference is between what you've done and what you're asking for by editing your original post.

ADD REPLY
0
Entering edit mode

Thank you. I added more explanation.

ADD REPLY
2
Entering edit mode
ADD REPLY
0
Entering edit mode

Hello Pierre,

All the posts you pulled here does not answer the question I asked in this post. I have always acknowledged the answer.

ADD REPLY
2
Entering edit mode

Perhaps you're unaware of how to acknowledge answers. There should be a check mark (in addition to up/down arrows) that you as the OP can toggle on/off.

ADD REPLY
1
Entering edit mode

no: eg.: Understanding SAM file format answer is correct but not commented or not flagged as answered.

ADD REPLY
0
Entering edit mode

Got it. Changed it to answer. Thank you.

ADD REPLY
3
Entering edit mode
4.0 years ago

using http://lindenb.github.io/jvarkit/VcfFilterJdk.html after samtools to get the reads strictly in the regions:

samtools view -b -F 4 in.bam "chr1:regionstart-regionend" |\
java -jar dist/vcffilterjdk -e 'return record.getStart()>=regionstart &&  record.getEnd()<=regionend;'
ADD COMMENT
1
Entering edit mode

This works well. Thank you very much.

ADD REPLY
0
Entering edit mode

Thank you very much. I will try this out.

ADD REPLY
1
Entering edit mode
4.0 years ago
samtools view -b -F 4 in.bam "chr1:regionstart-regionend" > out.bam

I assume by "unmapped in a region" you mean "having coordinates in a region but also being unmapped". So, just filter unmapped reads.

ADD COMMENT
0
Entering edit mode

Suppose I have a long read which maps to few exons of a gene A and it does not map to rest of the exons of gene A. Now when I extract the reads that map to say exon 1, I want to have only reads that mapped to a region of exon one. When I use the command you gave, that extracts the reads falling in that regions. however, few reads in that region do not map to the region specified but they map somewhere else on the gene.

I could see that on IGV few genes were not mapping to the region I specified but still present in the bam file I got from above command.

ADD REPLY
1
Entering edit mode

few reads in that region do not map to the region specified but they map somewhere else on the gene.

it's not clear if you're talking about a read or about a (paired) read and its' mate.

ADD REPLY
0
Entering edit mode

my original reads are single end reads coming from nanopore sequencing.

ADD REPLY

Login before adding your answer.

Traffic: 2623 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6