Question: extracting mapped reads from the bam file
0
gravatar for KVC_bioinfo
22 months ago by
KVC_bioinfo410
Boston
KVC_bioinfo410 wrote:

Hello all,

I am trying to extract the mapped reads from a bam file for a specific region.

samtools view -h in.bam "chr1:regionstart-regionend" > out.bam

this gives me all the reads falling into that region. (which contains read that mapped into that region plus reads unmapped into that region)

I am looking for reads that mapped to that specific region. Which options can I include?

Or samtools in unable to do it?

Thank you in advance.

mapped • 1.7k views
ADD COMMENTlink modified 22 months ago by Devon Ryan92k • written 22 months ago by KVC_bioinfo410
2

How are you distinguishing between reads that "fall" in a position, from reads that "map" to it?

ADD REPLYlink written 22 months ago by swbarnes26.7k

That is basically I am not able to do it. I am interested in extracting reads mapped in a specific region.

ADD REPLYlink written 22 months ago by KVC_bioinfo410
1

To most of us, there is no difference in what you've done and what you're asking for, so clarification is required. Please elaborate on what the difference is between what you've done and what you're asking for by editing your original post.

ADD REPLYlink written 22 months ago by cpak1981100

Thank you. I added more explanation.

ADD REPLYlink written 22 months ago by KVC_bioinfo410
2

please, stop asking questions and acknowledge the answers (green tick on the left) or comment the answers:

ADD REPLYlink modified 22 months ago • written 22 months ago by Pierre Lindenbaum123k

Hello Pierre,

All the posts you pulled here does not answer the question I asked in this post. I have always acknowledged the answer.

ADD REPLYlink written 22 months ago by KVC_bioinfo410
2

Perhaps you're unaware of how to acknowledge answers. There should be a check mark (in addition to up/down arrows) that you as the OP can toggle on/off.

ADD REPLYlink written 22 months ago by cpak1981100
1

no: eg.: Understanding SAM file format answer is correct but not commented or not flagged as answered.

ADD REPLYlink modified 22 months ago • written 22 months ago by Pierre Lindenbaum123k

Got it. Changed it to answer. Thank you.

ADD REPLYlink written 22 months ago by KVC_bioinfo410
3
gravatar for Pierre Lindenbaum
22 months ago by
France/Nantes/Institut du Thorax - INSERM UMR1087
Pierre Lindenbaum123k wrote:

using http://lindenb.github.io/jvarkit/VcfFilterJdk.html after samtools to get the reads strictly in the regions:

samtools view -b -F 4 in.bam "chr1:regionstart-regionend" |\
java -jar dist/vcffilterjdk -e 'return record.getStart()>=regionstart &&  record.getEnd()<=regionend;'
ADD COMMENTlink written 22 months ago by Pierre Lindenbaum123k
1

This works well. Thank you very much.

ADD REPLYlink written 22 months ago by KVC_bioinfo410

Thank you very much. I will try this out.

ADD REPLYlink written 22 months ago by KVC_bioinfo410
1
gravatar for Devon Ryan
22 months ago by
Devon Ryan92k
Freiburg, Germany
Devon Ryan92k wrote:
samtools view -b -F 4 in.bam "chr1:regionstart-regionend" > out.bam

I assume by "unmapped in a region" you mean "having coordinates in a region but also being unmapped". So, just filter unmapped reads.

ADD COMMENTlink written 22 months ago by Devon Ryan92k

Suppose I have a long read which maps to few exons of a gene A and it does not map to rest of the exons of gene A. Now when I extract the reads that map to say exon 1, I want to have only reads that mapped to a region of exon one. When I use the command you gave, that extracts the reads falling in that regions. however, few reads in that region do not map to the region specified but they map somewhere else on the gene.

I could see that on IGV few genes were not mapping to the region I specified but still present in the bam file I got from above command.

ADD REPLYlink written 22 months ago by KVC_bioinfo410
1

few reads in that region do not map to the region specified but they map somewhere else on the gene.

it's not clear if you're talking about a read or about a (paired) read and its' mate.

ADD REPLYlink written 22 months ago by Pierre Lindenbaum123k

my original reads are single end reads coming from nanopore sequencing.

ADD REPLYlink written 22 months ago by KVC_bioinfo410
Please log in to add an answer.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.3.0
Traffic: 1450 users visited in the last hour