Gzip output of fasterq-dump
8 months ago

Hello everyone,

I have always used fastq-dump to download raw data from the SRA, with the caveat that it was very slow. I recently switched to fasterq-dump, which is great in terms of speed, but its inability to gzip the fastq files on the fly is causing me lots of issues (the uncompressed fastq files are just too big for the system I am using).

I know fasterq-dump does not allow any gzipping of its output. Is there, however, any trick I could use to gzip these files before they take so much space? I tried piping a gzip command but that did not work. I have a suspicion this is not possible, but I had to give it a try.

Thanks so much!

SRA fasterq-dump
8 months ago

I wonder why fasterq-dump doesn't have the --gzip option that fastq-dump offered; it was handy. In any case, I have had good experiences with parallel-fastq-dump, using something like:

parallel-fastq-dump --tmpdir . --threads 8 --gzip --split-files --sra-id SRA1234
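
If you would rather stay with fasterq-dump, one workaround is to compress each run right after it is dumped, so that at most one run sits uncompressed on disk at a time. The following is only a rough sketch: dump_and_gzip is a hypothetical helper name, the thread count is arbitrary, and it assumes fasterq-dump and gzip are on your PATH.

```shell
#!/usr/bin/env bash
# Sketch: dump one accession with fasterq-dump, then gzip its FASTQ files.
# dump_and_gzip is a hypothetical helper name, not part of sra-tools.

dump_and_gzip() {
    local acc="$1"
    local outdir="${2:-.}"   # output directory, defaults to the current one
    local threads="${3:-4}"  # threads passed to fasterq-dump
    local f

    # Write uncompressed FASTQ for this accession only.
    fasterq-dump "$acc" --outdir "$outdir" --threads "$threads" || return 1

    # fasterq-dump names its output ACC.fastq, ACC_1.fastq, ACC_2.fastq, ...
    for f in "$outdir/$acc.fastq" "$outdir/${acc}_"*.fastq; do
        if [ -e "$f" ]; then
            gzip -f "$f" || return 1
        fi
    done
}

# Example (accessions.txt is a hypothetical file, one run accession per line):
# while read -r acc; do dump_and_gzip "$acc" fastq 8; done < accessions.txt
```

Note that this does not compress on the fly: you still need enough free space for the uncompressed FASTQ of a single run (plus fasterq-dump's temporary files) before gzip gets to it.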


That said, when I can, I avoid SRA altogether and download from ENA using curl, which is more transparent than sratools. ENA also serves the FASTQ files already gzipped.
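
For the ENA route, the file locations can be looked up through ENA's filereport endpoint, which returns a tab-separated table whose fastq_ftp column holds semicolon-separated paths without a URL scheme. Below is a rough sketch of turning that table into URLs for curl. ena_fastq_urls is a hypothetical helper name, the ftp:// default scheme is my assumption (pass another scheme if FTP is blocked on your system), and SRR1234567 is a placeholder accession.

```shell
#!/usr/bin/env bash
# Sketch: convert an ENA filereport table (TSV on stdin) into download URLs.
# ena_fastq_urls is a hypothetical helper name; the default scheme is an assumption.

ena_fastq_urls() {
    local scheme="${1:-ftp}"
    awk -F'\t' -v scheme="$scheme" '
        # Header row: find the column named fastq_ftp.
        NR == 1 {
            for (i = 1; i <= NF; i++) if ($i == "fastq_ftp") col = i
            next
        }
        # Data rows: the field holds one or more paths separated by ";".
        col && $col != "" {
            n = split($col, parts, ";")
            for (j = 1; j <= n; j++) print scheme "://" parts[j]
        }
    '
}

# Example (placeholder accession; downloads each file into the current directory):
# curl -s "https://www.ebi.ac.uk/ena/portal/api/filereport?accession=SRR1234567&result=read_run&fields=fastq_ftp" \
#     | ena_fastq_urls | xargs -n 1 curl -O
```

Looking the column up by its header name rather than by position keeps the sketch working if you request extra fields in the same query.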