Explanation of BAM compression levels
18 months ago
joe ▴ 510

Hi Biostars,

Can someone point me to documentation, or describe here, what the practical differences are for different compression levels of BAM files and why you might choose one over another? For example, in bedtools functions you can pass -ubam, or with picard you can pass --COMPRESSION_LEVEL, among others...

compression BAM • 856 views
18 months ago
LChart 3.9k

BAM files are block-gzip (BGZF) compressed (https://manpages.ubuntu.com/manpages/impish/man1/bgzip.1.html), so the "compression level" is the gzip level applied to each block independently, typically from 0 (no compression) through 9 (smallest output, slowest). See https://www.rootusers.com/gzip-vs-bzip2-vs-xz-performance-comparison/ for a benchmark of how the level affects size and speed.
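To make the per-block idea concrete, here is a minimal sketch using only Python's standard library. It is not a BAM or BGZF writer: it just gzips fixed-size chunks independently and concatenates them, which is the general shape of block compression. The block size, the helper names (make_fake_reads, compress_blocks, size_by_level) and the random A/C/G/T test data are all assumptions for illustration, so the sizes you get will not match those of real alignment data.

```python
import gzip
import random

# Assumed uncompressed bytes per block; chosen for illustration only.
BLOCK_SIZE = 65280


def make_fake_reads(n_bytes, seed=0):
    """Return n_bytes of pseudo-random A/C/G/T as stand-in sequence data."""
    rng = random.Random(seed)
    return "".join(rng.choice("ACGT") for _ in range(n_bytes)).encode("ascii")


def compress_blocks(data, level):
    """Gzip each fixed-size block independently and concatenate the results."""
    return b"".join(
        gzip.compress(data[i:i + BLOCK_SIZE], compresslevel=level)
        for i in range(0, len(data), BLOCK_SIZE)
    )


def size_by_level(data, levels=(0, 1, 6, 9)):
    """Map each compression level to the total compressed size in bytes."""
    return {level: len(compress_blocks(data, level)) for level in levels}
```

Calling size_by_level(make_fake_reads(200_000)) returns a dict of level to compressed size. Level 0 comes out slightly larger than the input because each block is only wrapped in gzip framing, not compressed. Because every block is a complete gzip member, gzip.decompress on the concatenated output recovers the original bytes.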


Thank you for pointing to these pages. Do you know why you would use one compression level over another? For example, high compression for files that will be stored versus no compression for files that are accessed regularly? Is there any guidance on the point (file size, access frequency, etc.) at which a different compression level should be considered?


For archival purposes, you're better off using CRAM than BAM. For individual BAM files, the difference in performance and space between compression levels is, in my experience, negligible.
