Question: extracting mapped reads from the bam file
0
gravatar for KVC_bioinfo
5 months ago by
KVC_bioinfo310
WA, USA
KVC_bioinfo310 wrote:

Hello all,

I am trying to extract the mapped reads from a bam file for a specific region.

samtools view -h in.bam "chr1:regionstart-regionend" > out.bam

this gives me all the reads falling into that region. (which contains read that mapped into that region plus reads unmapped into that region)

I am looking for reads that mapped to that specific region. Which options can I include?

Or samtools in unable to do it?

Thank you in advance.

mapped • 409 views
ADD COMMENTlink modified 5 months ago by Devon Ryan79k • written 5 months ago by KVC_bioinfo310
2

How are you distinguishing between reads that "fall" in a position, from reads that "map" to it?

ADD REPLYlink written 5 months ago by swbarnes23.6k

That is basically I am not able to do it. I am interested in extracting reads mapped in a specific region.

ADD REPLYlink written 5 months ago by KVC_bioinfo310
1

To most of us, there is no difference in what you've done and what you're asking for, so clarification is required. Please elaborate on what the difference is between what you've done and what you're asking for by editing your original post.

ADD REPLYlink written 5 months ago by cpak1981100

Thank you. I added more explanation.

ADD REPLYlink written 5 months ago by KVC_bioinfo310
2

please, stop asking questions and acknowledge the answers (green tick on the left) or comment the answers:

ADD REPLYlink modified 5 months ago • written 5 months ago by Pierre Lindenbaum107k

Hello Pierre,

All the posts you pulled here does not answer the question I asked in this post. I have always acknowledged the answer.

ADD REPLYlink written 5 months ago by KVC_bioinfo310
2

Perhaps you're unaware of how to acknowledge answers. There should be a check mark (in addition to up/down arrows) that you as the OP can toggle on/off.

ADD REPLYlink written 5 months ago by cpak1981100
1

no: eg.: Understanding SAM file format answer is correct but not commented or not flagged as answered.

ADD REPLYlink modified 5 months ago • written 5 months ago by Pierre Lindenbaum107k

Got it. Changed it to answer. Thank you.

ADD REPLYlink written 5 months ago by KVC_bioinfo310
3
gravatar for Pierre Lindenbaum
5 months ago by
France/Nantes/Institut du Thorax - INSERM UMR1087
Pierre Lindenbaum107k wrote:

using http://lindenb.github.io/jvarkit/VcfFilterJdk.html after samtools to get the reads strictly in the regions:

samtools view -b -F 4 in.bam "chr1:regionstart-regionend" |\
java -jar dist/vcffilterjdk -e 'return record.getStart()>=regionstart &&  record.getEnd()<=regionend;'
ADD COMMENTlink written 5 months ago by Pierre Lindenbaum107k
1

This works well. Thank you very much.

ADD REPLYlink written 5 months ago by KVC_bioinfo310

Thank you very much. I will try this out.

ADD REPLYlink written 5 months ago by KVC_bioinfo310
1
gravatar for Devon Ryan
5 months ago by
Devon Ryan79k
Freiburg, Germany
Devon Ryan79k wrote:
samtools view -b -F 4 in.bam "chr1:regionstart-regionend" > out.bam

I assume by "unmapped in a region" you mean "having coordinates in a region but also being unmapped". So, just filter unmapped reads.

ADD COMMENTlink written 5 months ago by Devon Ryan79k

Suppose I have a long read which maps to few exons of a gene A and it does not map to rest of the exons of gene A. Now when I extract the reads that map to say exon 1, I want to have only reads that mapped to a region of exon one. When I use the command you gave, that extracts the reads falling in that regions. however, few reads in that region do not map to the region specified but they map somewhere else on the gene.

I could see that on IGV few genes were not mapping to the region I specified but still present in the bam file I got from above command.

ADD REPLYlink written 5 months ago by KVC_bioinfo310
1

few reads in that region do not map to the region specified but they map somewhere else on the gene.

it's not clear if you're talking about a read or about a (paired) read and its' mate.

ADD REPLYlink written 5 months ago by Pierre Lindenbaum107k

my original reads are single end reads coming from nanopore sequencing.

ADD REPLYlink written 5 months ago by KVC_bioinfo310
Please log in to add an answer.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.3.0
Traffic: 913 users visited in the last hour