Question: Reduce Sam/Bam File Size
Nicolas Rosewick (Belgium, Brussels) wrote 7.3 years ago:

Hi,

I have a quick question about SAM and BAM file sizes.

When I use bwa to align paired-end reads (~50M reads) to a small reference sequence (~100 kb), I get a BAM file of about 5 GB. Looking at the alignment, only a few reads (~500 at most) actually aligned to this reference.

But when I use tophat with the same input and the same reference, the output BAM is only 10 kb and the number of aligned reads is the same...

So is there a way to reduce the size of my BAM file?

Thanks

bam sam • 4.1k views
Sean Davis (National Institutes of Health, Bethesda, MD) wrote 7.3 years ago:

Just guessing, but your first BAM file probably contains both aligned and unaligned reads, while the Tophat-produced BAM file contains only aligned reads. Both are "correct" BAM files; which is more useful will depend on your particular needs. If you decide that you do not need the unaligned reads, you can use samtools view with a flag filter to remove reads that are unmapped.


Is it OK like this: samtools view -F 4 in.bam > out.bam ?

(reply written 7.3 years ago by Nicolas Rosewick)

You'll probably also want to include -b and -h, for BAM output and for keeping the SAM header, respectively.

(reply written 7.3 years ago by Sean Davis)
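
For anyone wondering what the -F 4 filter discussed above actually tests: it drops every record whose FLAG field (the second SAM column) has bit 0x4 set, which is the "read unmapped" bit. Below is a rough Python sketch of the same logic applied to plain SAM text lines. It is only an illustration, not how samtools is implemented; the function name drop_unmapped, the keep_header parameter, and the three example records are made up for this sketch.

```python
# Illustrative sketch only: mimics the effect of `samtools view -F 4`
# on plain-text SAM lines. Names and example records are hypothetical.

UNMAPPED = 0x4  # SAM FLAG bit meaning "this read is unmapped"


def drop_unmapped(sam_lines, keep_header=True):
    """Yield header lines (if requested) and records without the 0x4 bit."""
    for line in sam_lines:
        if line.startswith("@"):
            # Header lines start with '@'; keeping them is what -h asks for.
            if keep_header:
                yield line
            continue
        flag = int(line.split("\t")[1])  # FLAG is the second column
        if not flag & UNMAPPED:
            yield line


example = [
    "@SQ\tSN:ref\tLN:100000",
    # FLAG 99 = 1 + 2 + 32 + 64: mapped read in a proper pair
    "read1\t99\tref\t100\t60\t4M\t=\t300\t204\tACGT\tIIII",
    # FLAG 77 = 1 + 4 + 8 + 64: read and its mate are both unmapped
    "read2\t77\t*\t0\t0\t*\t*\t0\t0\tACGT\tIIII",
]

kept = list(drop_unmapped(example))
for line in kept:
    print(line)
```

With ~50M input reads and only ~500 aligned, nearly all records look like read2 above, which would explain the 5 GB file. Putting the two replies together, the full command would be samtools view -b -h -F 4 in.bam > out.bam.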