Question: SRAtoolkit: error while running fastq-dump
1
3.1 years ago by arnstrm (1.6k), Ames, IA
arnstrm wrote:

Hi,

I am trying to download a few SRA files from the NCBI SRA database using the SRA Toolkit. The command I am running is:

fastq-dump --origfmt --gzip --split-files --accession SRR447946

But after a while of running (downloading), it gives an error:

=============================================================
An error occurred during processing.
A report was generated into the file '/home/arnstrm/ncbi_error_report.xml'.
If the problem persists, you may consider sending the file
to 'sra@ncbi.nlm.nih.gov' for assistance.
=============================================================

2014-10-20T16:04:57 fastq-dump.2.3.4 err: file descriptor invalid while constructing file within file system module - curl_easy_perform( FileRead, 1057095680.131072 ) failed with curl-error 'CURLE_COULDNT_CONNECT' (7)
2014-10-20T16:05:08 fastq-dump.2.3.4 err: file descriptor invalid while constructing file within file system module - curl_easy_perform( FileRead, 234749952.131072 ) failed with curl-error 'CURLE_OPERATION_TIMEDOUT' (28)
2014-10-20T16:05:08 fastq-dump.2.3.4 err: buffer insufficient while executing function within transform module - failed SRR447946

Does anyone know what the problem is here? A quick Google search didn't turn up any solutions.

Thanks for any help!

sra sratoolkit ncbi • 6.8k views
modified 3.1 years ago by Renesh (1.1k) • written 3.1 years ago by arnstrm (1.6k)
0
3.1 years ago by Renesh (1.1k), United States
Renesh wrote:

You are getting the error CURLE_COULDNT_CONNECT, which means your connection to the NCBI server failed. Check your connection and port, then retry the download. When I tried it on my system, it worked fine.
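Since the failure looks transient (a refused connection followed by a timeout), one option is to wrap the download in a small retry helper. The sketch below is only an illustration: the `retry` function and its arguments are hypothetical names made up here, not part of the SRA Toolkit.

```shell
# retry MAX DELAY CMD...: run CMD until it succeeds, at most MAX times,
# sleeping DELAY seconds between attempts. (Hypothetical helper.)
retry() {
    local max="$1" delay="$2" attempt=1
    shift 2
    until "$@"; do
        if [ "$attempt" -ge "$max" ]; then
            echo "giving up after $attempt attempts: $*" >&2
            return 1
        fi
        echo "attempt $attempt failed, retrying in ${delay}s" >&2
        sleep "$delay"
        attempt=$((attempt + 1))
    done
}

# Example (not run here): try the original command up to 3 times,
# waiting 60 seconds between attempts.
#   retry 3 60 fastq-dump --origfmt --gzip --split-files --accession SRR447946
```

This only helps if the connection problem is intermittent; if the machine cannot reach NCBI at all, every attempt will fail the same way.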

written 3.1 years ago by Renesh (1.1k)

If you are still getting the error, download the files manually from the SRA site: http://www.ncbi.nlm.nih.gov/sra/?term=SRR447946

written 3.1 years ago by Renesh (1.1k)

This is on an HPC cluster. So you think the compute nodes might be having connection issues? I haven't had problems downloading files before, but I will write to the sysadmin to see if they can solve this problem.
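A quick way to compare a login node with a compute node is to run the same connectivity probe on both. This is a rough sketch that assumes `curl` is installed on the cluster; `can_reach` is a hypothetical helper name, and the NCBI URL in the example is just a convenient target, not the exact endpoint fastq-dump uses.

```shell
# can_reach URL: succeed if an HTTP HEAD request to URL completes.
# Gives up connecting after 10 s and aborts the whole request after 20 s.
can_reach() {
    curl --silent --head --connect-timeout 10 --max-time 20 \
         --output /dev/null "$1"
}

# Example (not run here), to be tried on a login node and inside a job:
#   can_reach https://www.ncbi.nlm.nih.gov/ && echo reachable || echo blocked
```

The curl command-line exit codes mirror the libcurl error numbers in the log above, so an exit status of 7 (could not connect) or 28 (timed out) on a compute node would point to the same kind of failure fastq-dump reported.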

Thanks for the tip!

written 3.1 years ago by arnstrm (1.6k)
1

On the HPC, this should work:

wget ftp://ftp-trace.ncbi.nlm.nih.gov/sra/sra-instant/reads/ByRun/sra/SRR/SRR447/SRR447946/SRR447946.sra

written 3.1 years ago by Renesh (1.1k)

It's quite possible that the worker nodes are walled off from connecting to the rest of the net. Use wget as PyPerl said, then run fastq-dump on the downloaded file, and you should get the files you want.
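The two-step approach could be sketched roughly as follows. The URL layout is inferred from the single wget link quoted in this thread (three-letter prefix, then the first six characters, then the full accession); it is an assumption rather than documented NCBI behaviour, and the path may have changed or been retired since. `sra_ftp_url` and `fetch_and_dump` are hypothetical helper names.

```shell
# Build the FTP URL for a run accession, following the layout of the
# one URL quoted in this thread (inferred pattern, an assumption).
sra_ftp_url() {
    local acc="$1" prefix six
    prefix=$(printf '%.3s' "$acc")   # e.g. SRR
    six=$(printf '%.6s' "$acc")      # e.g. SRR447
    printf 'ftp://ftp-trace.ncbi.nlm.nih.gov/sra/sra-instant/reads/ByRun/sra/%s/%s/%s/%s.sra\n' \
        "$prefix" "$six" "$acc" "$acc"
}

# Download the .sra file first, then convert the local copy so that
# fastq-dump itself does not need network access.
fetch_and_dump() {
    local acc="$1"
    wget "$(sra_ftp_url "$acc")" &&
        fastq-dump --origfmt --gzip --split-files "./${acc}.sra"
}

# Example (not run here): fetch_and_dump SRR447946
```

If only the login node has internet access, the wget step can be run there and the fastq-dump step submitted as a job on a compute node.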

BTW, you can save time and just get this dataset from ENA. Then you don't have to deal with the annoyance (and time) of converting to fastq.
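One way to find the FASTQ files at ENA without guessing paths is to ask the ENA Portal API for the run's file report. The endpoint and query parameters below are not mentioned in this thread; they are an assumption based on ENA's public API and should be checked against the current ENA documentation. `ena_filereport_url` is a hypothetical helper name.

```shell
# Build an ENA Portal API "filereport" query that asks for the FASTQ
# FTP locations of a run accession. (Endpoint is an assumption.)
ena_filereport_url() {
    printf 'https://www.ebi.ac.uk/ena/portal/api/filereport?accession=%s&result=read_run&fields=fastq_ftp\n' "$1"
}

# Example (not run here): print the report for the run in question,
# then download the listed files with wget.
#   curl --silent "$(ena_filereport_url SRR447946)"
```

The report should list one or more gzipped FASTQ files for the run, which can then be fetched directly, skipping the fastq-dump conversion.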

written 3.1 years ago by Devon Ryan (73k)

Great, thanks for the EBI link. I will try that!

written 3.1 years ago by arnstrm (1.6k)