Question: How to generate a .sam/.bam file of a particular chromosome or even a particular region?
Chenglin (United States) wrote 3.7 years ago (1 vote):

I want to view a particular sequence in our re-sequencing data using IGV. However, the .bam file is too big (10Gb) to handle. Since I am interested in only a small region of the genome, I am wondering whether there is a way to generate a .sam/.bam file for just that region. Thank you very much in advance!

Chenglin

alignment next-gen genome • 5.3k views
modified 3 months ago by gsr9999 • written 3.7 years ago by Chenglin
Ashutosh Pandey (Philadelphia) wrote 3.7 years ago (4 votes):

This has been answered many times before. You can specify a small region of interest and write it to a small BAM file as shown below (input.bam stands for your sorted and indexed BAM file):

samtools view -bh input.bam chr1:100-200 > small.bam

modified 3.7 years ago • written 3.7 years ago by Ashutosh Pandey
2

FYI, -b will always write the header if it's present (this isn't documented, but see here), so samtools view -b input.bam chr1:100-200 > small.bam, where input.bam is the input file, will do the same thing.

written 3.7 years ago by Devon Ryan

Thanks Devon. Didn't know that. 

written 3.7 years ago by Ashutosh Pandey

Thank you very much for your answer. Do I need to put the reference genome index file (My_genome.fasta.fai) in the command? I typed something like "samtools view -bh chr1:100-200 My_genome.fasta.fai small.sam > small.bam", but I got the error "[main_samview] fail to open "My_genome.fasta.fai" for reading." My index file is there, so why can't it be read? Thanks!

ADD REPLYlink written 3.7 years ago by Chenglin90

You won't need the fai file at all.

The syntax is:

samtools view [OPTIONS] sorted_and_indexed.bam region

Edit: Just to make things clearer, in your case the command would be samtools view -b small.bam chr1:100-200 > small.subset.bam. The input must be a sorted and indexed BAM or CRAM file.
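The steps above can be sketched as a small shell function. This is only an illustration under stated assumptions: the function name extract_region and all file names are hypothetical, the input is assumed to be coordinate-sorted already, and the index is assumed to sit next to the BAM file as input.bam.bai or input.bam.csi.

```shell
# Sketch: extract one region from a coordinate-sorted BAM file.
# extract_region is a hypothetical helper name, not part of samtools.
extract_region() {
    local in_bam="$1" region="$2" out_bam="$3"

    # Region queries need an index; build one if none is found
    # next to the input (assumed naming: <file>.bai or <file>.csi).
    if [ ! -e "${in_bam}.bai" ] && [ ! -e "${in_bam}.csi" ]; then
        samtools index "$in_bam" || return 1
    fi

    # -b writes BAM; -o names the output file instead of using stdout.
    samtools view -b -o "$out_bam" "$in_bam" "$region" || return 1

    # Index the subset too, since viewers such as IGV expect an index.
    samtools index "$out_bam"
}

# Example call (placeholder file names):
# extract_region input.bam chr1:100-200 small.subset.bam
```

If the input is not sorted by coordinate, it would need to go through samtools sort first; the sketch does not check for that.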

modified 3.7 years ago • written 3.7 years ago by Devon Ryan

Thank you very much for your help. I tried it and it worked.

written 3.7 years ago by Chenglin

Thank you, Ashutosh. I just got it to work.

written 3.7 years ago by Chenglin
gsr9999 (United States) wrote 3 months ago (1 vote):

You can also use the sambamba tool to slice an existing BAM file down to specific regions. It is much faster than samtools.

Example: $ sambamba slice -o my_output_bam_file.bam Input_bam_file.bam chr1:100-200

Or you can extract the reads for an entire chromosome by specifying just the chromosome name as the region:

Example: $ sambamba slice -o chr1_bam_file.bam Input_bam_file.bam chr1
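To pull out several chromosomes in one go, the same command can be wrapped in a loop. This is a minimal sketch, not a tested recipe: the function name slice_chromosomes and the output naming scheme are hypothetical, and it assumes the input BAM file is coordinate-sorted and already indexed.

```shell
# Sketch: write one BAM file per requested chromosome with sambamba slice.
# slice_chromosomes is a hypothetical helper name, not part of sambamba.
slice_chromosomes() {
    local in_bam="$1"
    shift
    local chr
    for chr in "$@"; do
        # Output name is derived from the input, e.g. sample.chr1.bam.
        sambamba slice -o "${in_bam%.bam}.${chr}.bam" "$in_bam" "$chr" || return 1
    done
}

# Example call (placeholder file name):
# slice_chromosomes Input_bam_file.bam chr1 chr2
```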

written 3 months ago by gsr9999