Question: Method to Check Fastq Completeness after Fastq-dump
2.1 years ago, Shicheng Guo wrote:

Hi All,

How do you check that a FASTQ file is complete after downloading it from the SRA database with fastq-dump? I often end up with incomplete FASTQ files after running fastq-dump.

Thanks.

completeness fastq-dump • 1.4k views
modified 10 weeks ago by ATpoint • written 2.1 years ago by Shicheng Guo

You should always check EBI-ENA to see if fastq files are available. For the SRR# you posted below.

modified 2.1 years ago • written 2.1 years ago by genomax

see How can I find SRA MD5 checksums for FASQ files?

written 2.1 years ago by Pierre Lindenbaum

By the way, how do you resume a broken download with fastq-dump?

written 2.1 years ago by Shicheng Guo

17 months on and there is still no answer to this question. I have the same issue when dumping big files (~30G) and don't want to restart the download. How can I resume a broken download with fastq-dump? Best

written 8 months ago by Samad

Thanks. The method you mention works in some cases, but in most situations it doesn't. For example:

fastq-dump --split-files --gzip SRR949203

If you just download the SRA files, I think it is okay to use

 vdb-validate SRR949203
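
For the gzipped FASTQ output itself, a basic structural check can catch truncated files. The following is a minimal sketch, assuming 4-line FASTQ records (as fastq-dump writes them) and a hypothetical helper name `check_fastq_gz`. It tests the gzip stream and confirms that the line count is a multiple of 4. It cannot tell whether all reads from the run are present.

```shell
#!/usr/bin/env bash
# Sketch: basic structural checks on a gzipped FASTQ file.
# check_fastq_gz is a hypothetical helper name, not part of sra-tools.
check_fastq_gz() {
    local fq="$1"
    # 1. gzip -t fails if the compressed stream is truncated or corrupt.
    if ! gzip -t "$fq" 2>/dev/null; then
        echo "FAIL: $fq is not a complete gzip stream"
        return 1
    fi
    # 2. Each record is assumed to span 4 lines, so the total line count
    #    must be a multiple of 4.
    local lines
    lines=$(gzip -dc "$fq" | wc -l | tr -d ' ')
    if [ $((lines % 4)) -ne 0 ]; then
        echo "FAIL: $fq has $lines lines (not a multiple of 4)"
        return 1
    fi
    echo "OK: $fq has $((lines / 4)) reads"
}
```

For paired-end output from `--split-files`, running this on both `SRR949203_1.fastq.gz` and `SRR949203_2.fastq.gz` and comparing the reported read counts gives one more sanity check.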
modified 2.1 years ago • written 2.1 years ago by Shicheng Guo

10 weeks ago, ATpoint (Germany) wrote:

Just to update this: using fastq-dump for the download itself is not recommended. It is slow and prone to connection losses. It is better to use prefetch together with Aspera (see here) to get the SRA files, and then use fastq-dump to convert them to FASTQ. That said, you can get most data directly in FASTQ format from the European Nucleotide Archive. Downloading from there is simple and fast; see my tutorial on that: Fast download of FASTQ files and metadata from the European Nucleotide Archive (ENA).

If you have to download from NCBI, e.g. because the data are restricted, go with prefetch followed by parallel-fastq-dump, a wrapper that parallelizes fastq-dump. After successfully converting an SRA file to FASTQ, both tools (fastq-dump and parallel-fastq-dump) print a summary message that only appears if no errors occurred, so as long as that message was printed I have never felt the need to verify the FASTQ file after conversion.
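
The prefetch-then-convert workflow described above can be sketched roughly as follows. This is only an illustration: `fetch_and_convert` is a hypothetical wrapper name, the Aspera setup is left out, and it assumes sra-tools is installed and configured so that `vdb-validate` and `fastq-dump` find the run fetched by `prefetch`. The `vdb-validate` step is the check mentioned earlier in the thread.

```shell
#!/usr/bin/env bash
# Sketch: download with prefetch, validate the SRA file, then convert.
# fetch_and_convert is a hypothetical wrapper; each step stops the
# function if the underlying tool exits with a non-zero status.
fetch_and_convert() {
    local acc="$1"
    prefetch "$acc" \
        || { echo "prefetch failed for $acc" >&2; return 1; }
    vdb-validate "$acc" \
        || { echo "vdb-validate failed for $acc" >&2; return 1; }
    fastq-dump --split-files --gzip "$acc" \
        || { echo "fastq-dump failed for $acc" >&2; return 1; }
    echo "done: $acc"
}
```

Calling `fetch_and_convert SRR949203` would then only print `done: SRR949203` if all three tools exited successfully.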

modified 10 weeks ago • written 10 weeks ago by ATpoint

Hi ATpoint, how do I use Aspera on a Linux server?

written 10 weeks ago by Shicheng Guo

It is covered in Fast download of FASTQ files and metadata from the European Nucleotide Archive (ENA)

written 10 weeks ago by ATpoint