Convert fastq.gz to fastq.bgz
4.9 years ago
caspase8mach ▴ 20

Hello,

Could someone please explain the difference between .gz and .bgz format - what are the advantages of one format over the other?

I would also like to know a way to convert fastq.gz to fastq.bgz files (and vice versa).

Thanks in advance for the help.

  • Akhil
NGS FASTQ .gz .bgz • 9.0k views
4.9 years ago

A .bgz file is a "block gzipped" (BGZF) file, which can be thought of as a special kind of gzipped file. The main differences are summarized by Peter Cock and boil down to bgz files being compressed in independent blocks rather than as one continuous stream. This is mostly useful for random access, which is rarely needed with FASTQ files. In theory, cloud-based aligners could use bgz files more efficiently, but those aren't commonly used.
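To make the "compressed in blocks" idea concrete, here is a rough Python sketch. It is not real BGZF: it only concatenates ordinary gzip members and keeps a small in-memory index, and the names compress_in_blocks and read_at are made up for illustration. The 65280-byte block size is borrowed from bgzip's convention. The point is that a reader can decompress just the block it needs, while a plain gzip reader still sees one valid file.

```python
import gzip

BLOCK_SIZE = 65280  # uncompressed bytes per block (assumption: mirrors bgzip's convention)


def compress_in_blocks(data, block_size=BLOCK_SIZE):
    """Compress data as independent gzip members and record where each starts."""
    blob = bytearray()
    index = []  # one (compressed_start, compressed_length) pair per block
    for start in range(0, len(data), block_size):
        member = gzip.compress(data[start:start + block_size])
        index.append((len(blob), len(member)))
        blob += member
    return bytes(blob), index


def read_at(blob, index, offset, length, block_size=BLOCK_SIZE):
    """Return up to `length` bytes starting at uncompressed `offset`.

    Only the block containing `offset` is decompressed, so the result is
    cut short if the requested range crosses a block boundary.
    """
    block_no = offset // block_size
    cstart, clen = index[block_no]
    block = gzip.decompress(blob[cstart:cstart + clen])
    within = offset - block_no * block_size
    return block[within:within + length]
```

Because every block restarts compression from scratch and carries its own header, the blocked output usually ends up somewhat larger than a single gzip stream of the same data. That is the price paid for being able to jump into the middle of the file.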

To convert from gz to bgz: zcat foo.gz | bgzip -c > foo.bgz (and for the reverse: zcat foo.bgz | gzip -c > foo.gz). Having said that, pretty much anything that handles gzipped files can handle block gzipped files (except for Java programs).
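If bgzip isn't installed, the same conversion can be approximated in plain Python. This is only a sketch, assuming the BGZF block layout from the SAM/BAM specification (a gzip member with a "BC" extra subfield holding the block size, and an empty block as the end-of-file marker). The function names bgzf_block and gzip_to_bgzf are hypothetical, and bgzip itself remains the tool to use in practice.

```python
import gzip
import struct
import zlib

BLOCK_INPUT_SIZE = 0xFF00  # 65280 uncompressed bytes per block, as bgzip uses


def bgzf_block(chunk, level=6):
    """Wrap one chunk of data in a single BGZF block."""
    # Raw deflate (negative wbits): the gzip header is written by hand below.
    compressor = zlib.compressobj(level, zlib.DEFLATED, -15)
    cdata = compressor.compress(chunk) + compressor.flush()
    bsize = len(cdata) + 25  # total block length (18 + cdata + 8) minus 1
    if bsize > 0xFFFF:
        raise ValueError("compressed block does not fit in 64 KiB")
    header = struct.pack(
        "<BBBBIBBHBBHH",
        31, 139,   # gzip magic bytes
        8,         # compression method: deflate
        4,         # flags: FEXTRA set
        0,         # mtime
        0, 255,    # extra flags, OS = unknown
        6,         # length of the extra field
        66, 67,    # subfield id "BC"
        2,         # subfield payload length
        bsize,
    )
    trailer = struct.pack("<II", zlib.crc32(chunk), len(chunk))
    return header + cdata + trailer


def gzip_to_bgzf(src_path, dst_path):
    """Re-compress a gzip file as BGZF, one block at a time."""
    with gzip.open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        while True:
            chunk = src.read(BLOCK_INPUT_SIZE)
            if not chunk:
                break
            dst.write(bgzf_block(chunk))
        dst.write(bgzf_block(b""))  # empty block = BGZF end-of-file marker
```

Calling gzip_to_bgzf("foo.gz", "foo.bgz") would mirror the zcat | bgzip pipeline. The result is still a valid multi-member gzip file, so gzip-aware tools should be able to read it back.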


(except for Java programs)

Why?

Yup, you beat me by a minute :)


Good question. Apparently Java's default gzip implementation used to be unable to handle block gzipped files. I know that was an issue until at least a year or two ago; I'm not sure whether it still is.


To convert from one to the other: zcat foo.gz | gzip -c > foo.bgz or the reverse.

Just an addition, but apparently bgzip can decompress gzipped files, and gzip can decompress bgzipped files.
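Since both tools read both formats, the file extension alone won't tell you which kind of file you have. A small sketch for checking the first header, assuming the BGZF layout from the SAM/BAM specification (gzip FEXTRA flag plus a "BC" subfield of length 2); is_bgzf is a made-up helper name, not part of any tool:

```python
import struct


def is_bgzf(path):
    """Return True if the file starts with a BGZF-style gzip header."""
    with open(path, "rb") as fh:
        header = fh.read(12)
        # Need the gzip magic bytes and the deflate compression method.
        if len(header) < 12 or header[:3] != b"\x1f\x8b\x08":
            return False
        if not header[3] & 0x04:  # FEXTRA flag not set: ordinary gzip
            return False
        (xlen,) = struct.unpack("<H", header[10:12])
        extra = fh.read(xlen)
    # Walk the extra subfields looking for "BC" with a 2-byte payload.
    pos = 0
    while pos + 4 <= len(extra):
        si1, si2, slen = struct.unpack_from("<BBH", extra, pos)
        if si1 == 66 and si2 == 67 and slen == 2:
            return True
        pos += 4 + slen
    return False
```

This only inspects the first block, so it is a quick heuristic rather than a full validation of the file.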


Correct, and I forgot a b in bgzip.


Is it faster than bcftools view my.vcf.gz -Oz -o my_bgziped.vcf.gz?


Yes. bcftools actually parses the file, which is unnecessary when you just need to run it through bgzip.


I tested it quickly and it is a lot faster (we're talking days for bcftools compared to just a couple of hours with zcat).

4.9 years ago
h.mon 34k

Read a detailed comparison at BGZF - Blocked, Bigger & Better GZIP!

Summary: gzip is available on virtually any Unix / Linux system, whereas bgzip isn't; bgzip produces bigger files than gzip; however, bgzip allows much faster random access than gzip.
