Convert fastq.gz to fastq.bgz
4.9 years ago
caspase8mach ▴ 20

Hello,

Could someone please explain the difference between .gz and .bgz format - what are the advantages of one format over the other?

I would also like to know a way to convert fastq.gz to fastq.bgz files (and vice versa).

Thanks in advance for the help.

• Akhil
NGS FASTQ .gz .bgz • 9.0k views
4.9 years ago

A .bgz file is a "block gzipped" (BGZF) file, which can be thought of as a special kind of gzipped file. The main differences are summarized here by Peter Cock and boil down to bgz files being compressed as a series of independent blocks rather than as one continuous stream. This is mostly useful for random access, which isn't commonly needed with fastq files. In theory one could use bgz files more efficiently with cloud-based aligners, but those aren't exactly commonly used.
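Since a block gzipped file is still a gzip file, the extension alone won't tell you which kind you have. A rough way to check is to look at the header bytes. This is only a sketch: `is_bgzf` is a made-up helper name, and it assumes the "BC" extra subfield comes first in the gzip header, which is how bgzip writes it (the format itself allows other subfields before it).

```shell
# Return success if the file starts with a BGZF block header.
# Assumption: the "BC" subfield is the first gzip extra subfield.
is_bgzf() {
    # Hex dump of the first 16 bytes, with all whitespace stripped.
    header_hex=$(head -c 16 "$1" | od -An -tx1 | tr -d '[:space:]')
    case "$header_hex" in
        # 1f 8b = gzip magic, 08 = deflate, 04 = "extra field present",
        # then 8 bytes we ignore, then 42 43 = "BC".
        1f8b0804????????????????4243*) return 0 ;;
        *) return 1 ;;
    esac
}

# Usage (foo.fastq.gz is a placeholder file name):
#   is_bgzf foo.fastq.gz && echo "block gzipped" || echo "plain gzip (or not gzip)"
```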

To convert from gzip to block gzip: zcat foo.gz | bgzip -c > foo.bgz. For the reverse, swap the file names and pipe through gzip instead: zcat foo.bgz | gzip -c > foo.gz. Having said that, pretty much anything that handles gzipped files can handle block gzipped files (except for Java programs).
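If you do this often, the two directions can be wrapped in small shell functions. This is just a sketch: the function names are made up, and it uses gzip -dc rather than zcat because zcat behaves differently on some systems (macOS, for example). It assumes bgzip from htslib is on your PATH for the first function.

```shell
# gzip -> block gzip: decompress, then recompress in blocks with bgzip.
gz_to_bgz() {
    gzip -dc "$1" | bgzip -c > "$2"
}

# block gzip -> gzip: plain gzip can read BGZF, so bgzip is not needed here.
bgz_to_gz() {
    gzip -dc "$1" | gzip -c > "$2"
}

# Usage (file names are placeholders):
#   gz_to_bgz foo.fastq.gz foo.fastq.bgz
#   bgz_to_gz foo.fastq.bgz foo.fastq.gz
```

Both functions stream the data, so the uncompressed fastq is never written to disk.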


(except for Java programs)

Why?


Yup, you beat me by a minute :)


Good question. Apparently Java's default gzip implementation used to be unable to handle block gzipped files. I know that was an issue until at least a year or two ago; I'm not sure if it still is.


To convert from one to the other: zcat foo.gz | gzip -c > foo.bgz or the reverse.

Just an addition, but apparently bgzip can decompress gzipped files, and gzip can decompress bgzipped files.
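The reason gzip can read bgzipped files is that a block gzipped file is a series of small gzip streams written back to back, and gzip happily decompresses concatenated streams. You can see that behaviour with plain gzip alone; the sketch below uses a throwaway temp directory and two made-up reads.

```shell
# Build a file out of two separately compressed gzip streams.
demo_dir=$(mktemp -d)
printf '@r1\nACGT\n+\nIIII\n' | gzip -c >  "$demo_dir/two_members.gz"
printf '@r2\nTTGA\n+\nIIII\n' | gzip -c >> "$demo_dir/two_members.gz"

# gzip decompresses both streams in order, giving both reads.
gzip -dc "$demo_dir/two_members.gz"
```

bgzip does essentially the same thing, except that each block also carries a small extra header field recording the block's size.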


Correct, and I forgot a b in bgzip.


Is the zcat | bgzip approach faster than bcftools view my.vcf.gz -Oz -o my_bgziped.vcf.gz?


Yes. bcftools actually parses the file, which is unnecessary when you are just running it through bgzip.


I tested it quickly and it is a lot faster (we're talking days for bcftools compared to just a couple of hours with zcat).

4.9 years ago
h.mon 34k

Read a detailed comparison at BGZF - Blocked, Bigger & Better GZIP!

Summary: gzip is available on virtually any Unix / Linux system, whereas bgzip has to be installed separately (it ships with htslib); bgzip produces somewhat bigger files than gzip; however, bgzip allows much faster random access than gzip.
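To give an idea of what random access looks like in practice, here is a sketch using bgzip's own index. `bgz_fetch` is a made-up wrapper name; the -i, -b and -s options are bgzip's, but check bgzip --help on your version, since older releases may not have them.

```shell
# Print SIZE uncompressed bytes starting at uncompressed OFFSET from a BGZF file.
# Expects the .gzi index written by "bgzip -i" to sit next to the file.
bgz_fetch() {
    bgzip -b "$2" -s "$3" "$1"
}

# Usage (reads.fastq is a placeholder file name):
#   bgzip -i reads.fastq              # writes reads.fastq.gz and reads.fastq.gz.gzi
#   bgz_fetch reads.fastq.gz 16 16    # 16 bytes starting at uncompressed offset 16
```

With a plain gzip file the same request would mean decompressing everything up to the offset first.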