Question: Size of BAM file reduces after sorting with samtools
5.1 years ago by
shakeelbiochemist10 wrote:

I have 3 BAM files of the same species, each ~7 GB, from three experimental runs. I merged the three BAM files into a single 22 GB BAM file using samtools merge with the -r option. I then sorted this merged BAM file with samtools sort and got an 11 GB file. Is it possible for sorting to reduce the size of the merged BAM file by 50%?

Yes.

You can use samtools flagstat .bam to check read counts etc. for the different files.

5.1 years ago by
Sean Davis26k
National Institutes of Health, Bethesda, MD

When you sort by coordinate, reads with similar sequences end up next to each other, which gives the compression algorithm more redundancy to exploit, so a large drop in file size is plausible. It is still worth checking that the number of reads is what you expect, using `samtools flagstat` or simply `samtools view` piped to `wc -l`.
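To see the effect in isolation, here is a minimal sketch in Python. It is a toy model and not how samtools works internally: it assumes a random made-up reference, simulated error-free reads, and plain `zlib` as a stand-in for BAM's BGZF (which is DEFLATE applied in blocks). The function names `simulate_reads` and `compressed_size` are hypothetical helpers for illustration only.

```python
import random
import zlib


def simulate_reads(n_reads=20000, read_len=100, ref_len=200000, seed=0):
    """Return (start, sequence) pairs drawn at random positions of a toy reference."""
    rng = random.Random(seed)
    # Assumption: a uniformly random ACGT string stands in for a genome.
    reference = "".join(rng.choice("ACGT") for _ in range(ref_len))
    starts = [rng.randrange(ref_len - read_len) for _ in range(n_reads)]
    # Reads come out in random positional order, like an unsorted BAM.
    return [(s, reference[s:s + read_len]) for s in starts]


def compressed_size(reads):
    """Size in bytes of the read sequences after DEFLATE compression."""
    payload = "\n".join(seq for _, seq in reads).encode("ascii")
    return len(zlib.compress(payload, 6))


reads = simulate_reads()
unsorted_size = compressed_size(reads)
# Sorting by start position places overlapping reads next to each other,
# so the compressor finds their shared bases inside its small window.
sorted_size = compressed_size(sorted(reads))

print(f"unsorted: {unsorted_size} bytes")
print(f"sorted:   {sorted_size} bytes")
print(f"ratio:    {sorted_size / unsorted_size:.2f}")
```

The exact ratio depends on coverage, read length, and everything else stored in a real BAM record (names, qualities, tags), so this only illustrates the direction of the effect, not the 50% figure from the question.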



