Question: BAM File size increased after extracting unique reads
6 months ago by
ilovesuperheroes19930 wrote:

Hi, I used the STAR aligner to map my reads, and the output BAM files are sorted by coordinate. I used the following command to extract the unique reads from my BAM files:

samtools view -q 255 input_file.bam > unique_reads.bam

(A MAPQ value of 255 corresponds to unique alignments in STAR)

However, the new BAM files are several times larger than the originals. (For example, BAM files that were originally 500-900 MB are now around 2.5 GB.) This has happened for all the samples.

When I count the lines in the BAM files (the original and the one containing only the unique reads), the original file (about 500 MB) has 44 million lines, while the new file (about 2 GB) has 17 million lines. The line counts are as expected.

I have checked the headers of both BAM files, and both are sorted by coordinate.

So, could anyone tell me why the file with fewer lines is so much larger?

6 months ago by
michael.ante3.6k wrote:


Without the -b option, you'll get a SAM file, which is not compressed. Adding -b and -h to your command will produce a valid, compressed BAM file.
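
A minimal sketch of the corrected command, wrapped in a shell function so the input and output names are explicit. The function name extract_unique is made up for illustration; the file names in the usage comment are the ones from the question, and -q 255 assumes STAR's default of assigning MAPQ 255 to uniquely mapped reads.

```shell
# Hypothetical helper (the name is illustrative, not part of samtools):
# keep alignments with MAPQ >= 255 and write them as compressed BAM.
#   -b  write BAM instead of plain-text SAM
#   -h  include the header in the output
#   -q  minimum mapping quality to keep
extract_unique() {
    samtools view -b -h -q 255 "$1" > "$2"
}

# Usage with the file names from the question:
# extract_unique input_file.bam unique_reads.bam
```

With -b in place, the filtered file should come out smaller than the original, since it holds fewer alignments and is compressed again.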

Agreed. In recent samtools versions you don't even need to set the format flags: if you use -o instead of redirecting stdout, samtools recognizes the output format from the file suffix, for example samtools view -q 255 -o unique_reads.bam input_file.bam. With your current command you produced a SAM file instead of a BAM file, and without a header because -h was missing. When -b is used, -h is implied.
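
As a rough sketch, the -o form and a quick check of what was actually written could look like this. Both function names are made up for illustration. The check relies on BAM being BGZF-compressed, so a real BAM file starts with the gzip magic bytes 1f 8b, while a SAM file is plain text. It only shows that the file is gzip/BGZF-compressed, not that it is a complete, valid BAM. Suffix-based format detection needs a reasonably recent samtools; on older versions, add -b explicitly.

```shell
# Hypothetical helper: let samtools choose the output format from the
# .bam suffix of the -o argument (recent samtools versions only).
extract_unique_o() {
    samtools view -q 255 -o "$2" "$1"
}

# Hypothetical helper: succeed if the file starts with the gzip/BGZF
# magic bytes 1f 8b, i.e. it is compressed rather than plain-text SAM.
looks_compressed() {
    magic=$(head -c 2 "$1" | od -An -tx1 | tr -d ' \n')
    [ "$magic" = "1f8b" ]
}

# Usage with the file names from the question:
# extract_unique_o input_file.bam unique_reads.bam
# looks_compressed unique_reads.bam && echo "compressed" || echo "plain text, probably SAM"
```

Run on the unique_reads.bam produced by the original command, the check would be expected to fail, because that file is SAM text despite its .bam name.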

written 6 months ago by ATpoint30k