Question: Submit High-Throughput data to GEO. Problems with FTP connection
0
8 months ago by
mgdrnl10
Barcelona
mgdrnl10 wrote:

I am opening this question because I have not found much information about this topic so far. I am trying to upload data from an RNA-seq project (387.5 GB) to GEO from the UNIX command line, and I keep getting this error:

 Lost data connection to remote host after 1xxxxxxxx bytes had been sent: Broken pipe.

The number of bytes varies each time. When I asked the IT service at my institution, they told me that FTP is very slow and that broken connections are to be expected for such big files.

I solved the issue using the script in this post.

However, it would be helpful if anybody could share other answers to this problem, and perhaps ways to improve the speed, as it is taking a long time to submit all the files.
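For anyone hitting the same "Broken pipe" error, one general workaround is to reconnect after each failure and resume from the byte offset the server already holds, instead of starting over. The sketch below is not the script mentioned above and is not an official GEO tool; it is a minimal illustration using Python's standard `ftplib`. The names `upload_with_resume`, `remote_size`, and `connect` are made up for this example, and it assumes the server accepts the `SIZE` and `REST` commands for uploads, which not every FTP server does.

```python
import ftplib
import os
import time


def remote_size(ftp, remote_name):
    """Return the number of bytes already on the server (0 if the file is absent)."""
    try:
        ftp.voidcmd("TYPE I")  # many servers only answer SIZE in binary mode
        size = ftp.size(remote_name)
    except ftplib.error_perm:
        return 0  # typically "550 no such file": nothing uploaded yet
    return size or 0


def upload_with_resume(connect, local_path, remote_name,
                       max_attempts=10, wait_seconds=30):
    """Upload local_path, resuming after dropped connections.

    connect is a caller-supplied function (hypothetical) that returns a
    logged-in ftplib.FTP object already in the target directory.
    Returns the number of attempts that were needed.
    """
    total = os.path.getsize(local_path)
    for attempt in range(1, max_attempts + 1):
        ftp = None
        try:
            ftp = connect()
            offset = remote_size(ftp, remote_name)
            if offset > total:
                raise ValueError("remote file is larger than the local file")
            if offset == total:
                return attempt  # everything is already on the server
            with open(local_path, "rb") as fp:
                fp.seek(offset)  # skip the part the server already has
                ftp.storbinary("STOR " + remote_name, fp,
                               blocksize=1024 * 1024,
                               rest=offset or None)  # REST only when resuming
            return attempt
        except (OSError, EOFError, ftplib.error_temp):
            # Broken pipe, timeout, or a temporary server error: wait and retry.
            time.sleep(wait_seconds)
        finally:
            if ftp is not None:
                ftp.close()
    raise RuntimeError("upload did not finish after %d attempts" % max_attempts)
```

A possible way to call it, where the host, credentials, and directory are placeholders to be replaced with the details you received for your own submission:

```python
def connect():
    ftp = ftplib.FTP("ftp.example.org", timeout=60)  # placeholder host
    ftp.login("username", "password")                # placeholder credentials
    ftp.cwd("upload_directory")                      # placeholder directory
    return ftp

upload_with_resume(connect, "sample1_R1.fastq.gz", "sample1_R1.fastq.gz")
```

This does not make FTP itself any faster; it only avoids resending bytes that already arrived.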

Thanks a lot!

next-gen ftp geo • 418 views
modified 8 months ago by arup1.4k • written 8 months ago by mgdrnl10

That does not sound like a bioinformatics question to me!

written 8 months ago by Vijay Lakhujani4.1k

Submit your data to ArrayExpress; it has a better interface for metadata management and file uploading. The direct FTP connection is also fast, accession IDs are provided within a couple of hours, and within a week they will provide the reviewer account details.

written 8 months ago by arup1.4k

Thanks! I will definitely try ArrayExpress next time.

written 8 months ago by mgdrnl10
3
8 months ago by
Bergen, Norway
Michael Dondrup46k wrote:

Are you sure you want to submit RNA-seq data (raw data?) to GEO? You should submit to SRA instead. NCBI supports upload via Aspera Connect (ascp, which has a command-line interface very similar to scp), which is faster and more robust against interrupted network connections. See: https://www.ncbi.nlm.nih.gov/books/NBK242625/ and https://www.ncbi.nlm.nih.gov/sra/docs/submit/

modified 8 months ago • written 8 months ago by Michael Dondrup46k

Thank you very much for the recommendation! The truth is I didn't know about SRA. Unfortunately, the journal I am submitting the paper to asks me to deposit the data (raw data, fastq files) in GEO or ArrayExpress. I will try ArrayExpress next time, as someone above said it is better, unless the journal changes its policy.

written 8 months ago by mgdrnl10