Question: extracting mapped reads from the bam file
0
gravatar for KVC_bioinfo
11 months ago by
KVC_bioinfo350
Boston
KVC_bioinfo350 wrote:

Hello all,

I am trying to extract the mapped reads from a bam file for a specific region.

samtools view -h in.bam "chr1:regionstart-regionend" > out.bam

this gives me all the reads falling into that region. (which contains read that mapped into that region plus reads unmapped into that region)

I am looking for reads that mapped to that specific region. Which options can I include?

Or samtools in unable to do it?

Thank you in advance.

mapped • 786 views
ADD COMMENTlink modified 11 months ago by Devon Ryan86k • written 11 months ago by KVC_bioinfo350
2

How are you distinguishing between reads that "fall" in a position, from reads that "map" to it?

ADD REPLYlink written 11 months ago by swbarnes24.5k

That is basically I am not able to do it. I am interested in extracting reads mapped in a specific region.

ADD REPLYlink written 11 months ago by KVC_bioinfo350
1

To most of us, there is no difference in what you've done and what you're asking for, so clarification is required. Please elaborate on what the difference is between what you've done and what you're asking for by editing your original post.

ADD REPLYlink written 11 months ago by cpak1981100

Thank you. I added more explanation.

ADD REPLYlink written 11 months ago by KVC_bioinfo350
2

please, stop asking questions and acknowledge the answers (green tick on the left) or comment the answers:

ADD REPLYlink modified 11 months ago • written 11 months ago by Pierre Lindenbaum114k

Hello Pierre,

All the posts you pulled here does not answer the question I asked in this post. I have always acknowledged the answer.

ADD REPLYlink written 11 months ago by KVC_bioinfo350
2

Perhaps you're unaware of how to acknowledge answers. There should be a check mark (in addition to up/down arrows) that you as the OP can toggle on/off.

ADD REPLYlink written 11 months ago by cpak1981100
1

no: eg.: Understanding SAM file format answer is correct but not commented or not flagged as answered.

ADD REPLYlink modified 11 months ago • written 11 months ago by Pierre Lindenbaum114k

Got it. Changed it to answer. Thank you.

ADD REPLYlink written 11 months ago by KVC_bioinfo350
3
gravatar for Pierre Lindenbaum
11 months ago by
France/Nantes/Institut du Thorax - INSERM UMR1087
Pierre Lindenbaum114k wrote:

using http://lindenb.github.io/jvarkit/VcfFilterJdk.html after samtools to get the reads strictly in the regions:

samtools view -b -F 4 in.bam "chr1:regionstart-regionend" |\
java -jar dist/vcffilterjdk -e 'return record.getStart()>=regionstart &&  record.getEnd()<=regionend;'
ADD COMMENTlink written 11 months ago by Pierre Lindenbaum114k
1

This works well. Thank you very much.

ADD REPLYlink written 11 months ago by KVC_bioinfo350

Thank you very much. I will try this out.

ADD REPLYlink written 11 months ago by KVC_bioinfo350
1
gravatar for Devon Ryan
11 months ago by
Devon Ryan86k
Freiburg, Germany
Devon Ryan86k wrote:
samtools view -b -F 4 in.bam "chr1:regionstart-regionend" > out.bam

I assume by "unmapped in a region" you mean "having coordinates in a region but also being unmapped". So, just filter unmapped reads.

ADD COMMENTlink written 11 months ago by Devon Ryan86k

Suppose I have a long read which maps to few exons of a gene A and it does not map to rest of the exons of gene A. Now when I extract the reads that map to say exon 1, I want to have only reads that mapped to a region of exon one. When I use the command you gave, that extracts the reads falling in that regions. however, few reads in that region do not map to the region specified but they map somewhere else on the gene.

I could see that on IGV few genes were not mapping to the region I specified but still present in the bam file I got from above command.

ADD REPLYlink written 11 months ago by KVC_bioinfo350
1

few reads in that region do not map to the region specified but they map somewhere else on the gene.

it's not clear if you're talking about a read or about a (paired) read and its' mate.

ADD REPLYlink written 11 months ago by Pierre Lindenbaum114k

my original reads are single end reads coming from nanopore sequencing.

ADD REPLYlink written 11 months ago by KVC_bioinfo350
Please log in to add an answer.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.3.0
Traffic: 1637 users visited in the last hour