Question: extracting mapped reads from the bam file
0
gravatar for KVC_bioinfo
16 months ago by
KVC_bioinfo380
Boston
KVC_bioinfo380 wrote:

Hello all,

I am trying to extract the mapped reads from a bam file for a specific region.

samtools view -h in.bam "chr1:regionstart-regionend" > out.bam

this gives me all the reads falling into that region. (which contains read that mapped into that region plus reads unmapped into that region)

I am looking for reads that mapped to that specific region. Which options can I include?

Or samtools in unable to do it?

Thank you in advance.

mapped • 1.2k views
ADD COMMENTlink modified 16 months ago by Devon Ryan89k • written 16 months ago by KVC_bioinfo380
2

How are you distinguishing between reads that "fall" in a position, from reads that "map" to it?

ADD REPLYlink written 16 months ago by swbarnes25.2k

That is basically I am not able to do it. I am interested in extracting reads mapped in a specific region.

ADD REPLYlink written 16 months ago by KVC_bioinfo380
1

To most of us, there is no difference in what you've done and what you're asking for, so clarification is required. Please elaborate on what the difference is between what you've done and what you're asking for by editing your original post.

ADD REPLYlink written 16 months ago by cpak1981100

Thank you. I added more explanation.

ADD REPLYlink written 16 months ago by KVC_bioinfo380
2

please, stop asking questions and acknowledge the answers (green tick on the left) or comment the answers:

ADD REPLYlink modified 16 months ago • written 16 months ago by Pierre Lindenbaum119k

Hello Pierre,

All the posts you pulled here does not answer the question I asked in this post. I have always acknowledged the answer.

ADD REPLYlink written 16 months ago by KVC_bioinfo380
2

Perhaps you're unaware of how to acknowledge answers. There should be a check mark (in addition to up/down arrows) that you as the OP can toggle on/off.

ADD REPLYlink written 16 months ago by cpak1981100
1

no: eg.: Understanding SAM file format answer is correct but not commented or not flagged as answered.

ADD REPLYlink modified 16 months ago • written 16 months ago by Pierre Lindenbaum119k

Got it. Changed it to answer. Thank you.

ADD REPLYlink written 16 months ago by KVC_bioinfo380
3
gravatar for Pierre Lindenbaum
16 months ago by
France/Nantes/Institut du Thorax - INSERM UMR1087
Pierre Lindenbaum119k wrote:

using http://lindenb.github.io/jvarkit/VcfFilterJdk.html after samtools to get the reads strictly in the regions:

samtools view -b -F 4 in.bam "chr1:regionstart-regionend" |\
java -jar dist/vcffilterjdk -e 'return record.getStart()>=regionstart &&  record.getEnd()<=regionend;'
ADD COMMENTlink written 16 months ago by Pierre Lindenbaum119k
1

This works well. Thank you very much.

ADD REPLYlink written 16 months ago by KVC_bioinfo380

Thank you very much. I will try this out.

ADD REPLYlink written 16 months ago by KVC_bioinfo380
1
gravatar for Devon Ryan
16 months ago by
Devon Ryan89k
Freiburg, Germany
Devon Ryan89k wrote:
samtools view -b -F 4 in.bam "chr1:regionstart-regionend" > out.bam

I assume by "unmapped in a region" you mean "having coordinates in a region but also being unmapped". So, just filter unmapped reads.

ADD COMMENTlink written 16 months ago by Devon Ryan89k

Suppose I have a long read which maps to few exons of a gene A and it does not map to rest of the exons of gene A. Now when I extract the reads that map to say exon 1, I want to have only reads that mapped to a region of exon one. When I use the command you gave, that extracts the reads falling in that regions. however, few reads in that region do not map to the region specified but they map somewhere else on the gene.

I could see that on IGV few genes were not mapping to the region I specified but still present in the bam file I got from above command.

ADD REPLYlink written 16 months ago by KVC_bioinfo380
1

few reads in that region do not map to the region specified but they map somewhere else on the gene.

it's not clear if you're talking about a read or about a (paired) read and its' mate.

ADD REPLYlink written 16 months ago by Pierre Lindenbaum119k

my original reads are single end reads coming from nanopore sequencing.

ADD REPLYlink written 16 months ago by KVC_bioinfo380
Please log in to add an answer.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.3.0
Traffic: 1329 users visited in the last hour