Question: Submit High-Throughput data to GEO. Problems with FTP connection
mgdrnl (Barcelona) wrote, 2.2 years ago:

I am opening this question because I haven't found much information on this topic so far. I am trying to upload data from an RNA-seq project to GEO (387.5 GB) from the UNIX command line, and I am getting the error:

 Lost data connection to remote host after 1xxxxxxxx bytes had been sent: Broken pipe.

The number of bytes varies each time. When I asked the IT service at my institution, they told me that the FTP protocol is very slow and that a broken connection is to be expected for such big files.

I solved the issue using the script in this post.

However, it would be helpful if anybody could share other solutions to this problem, ideally ones that also improve the speed, as it is taking a long time to submit all the files.
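For readers hitting the same "Broken pipe" error, one general approach is to reconnect after each failure and resume from the number of bytes the server already holds. The following is a minimal Python sketch of that idea using the standard `ftplib` module; it is not the script from the linked post. It assumes the server supports the `SIZE` and `REST` commands, and the function names (`backoff_delays`, `remote_size`, `upload_with_resume`) as well as the host, credentials, and directory arguments are hypothetical placeholders.

```python
import ftplib
import os
import time


def backoff_delays(max_attempts, base_delay=5.0, cap=300.0):
    """Seconds to wait before each retry: base, 2*base, 4*base, ... up to cap."""
    return [min(cap, base_delay * 2 ** i) for i in range(max_attempts - 1)]


def remote_size(ftp, name):
    """Bytes the server already holds for `name`, or 0 if the file is absent."""
    try:
        ftp.voidcmd("TYPE I")  # many servers only answer SIZE in binary mode
        return ftp.size(name) or 0
    except ftplib.error_perm:
        return 0


def upload_with_resume(host, user, password, remote_dir, local_path,
                       max_attempts=10, connect=ftplib.FTP, sleep=time.sleep):
    """Upload one file, reconnecting and resuming after each dropped connection.

    Assumption: the server honours REST before STOR. `connect` and `sleep`
    are parameters only so the function can be exercised without a network.
    """
    name = os.path.basename(local_path)
    total = os.path.getsize(local_path)
    delays = backoff_delays(max_attempts)
    for attempt in range(max_attempts):
        try:
            with connect(host, timeout=60) as ftp:
                ftp.login(user, password)
                ftp.cwd(remote_dir)
                offset = remote_size(ftp, name)
                if offset < total:
                    with open(local_path, "rb") as fh:
                        fh.seek(offset)  # skip what the server already has
                        ftp.storbinary(f"STOR {name}", fh, blocksize=1 << 20,
                                       rest=offset or None)
                if remote_size(ftp, name) >= total:
                    return True
        except ftplib.all_errors as exc:
            print(f"attempt {attempt + 1} failed: {exc}")
        if attempt < len(delays):
            sleep(delays[attempt])
    return False
```

Calling `upload_with_resume` once per FASTQ file would retry each file independently. This only makes the transfer more robust; it does not make FTP itself any faster.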

Thanks a lot!


That does not sound like a bioinformatics question to me!

Comment written 2.2 years ago by lakhujanivijay

Submit your data to ArrayExpress; it has a better interface for metadata management and file uploading. The direct FTP connection is also fast, accession IDs are provided within a couple of hours, and within a week they will provide the reviewer account details.

Comment written 2.2 years ago by Arup Ghosh

Thanks! I will definitely try ArrayExpress next time.

Comment written 2.2 years ago by mgdrnl
Michael Dondrup (Bergen, Norway) wrote, 2.2 years ago:

Are you sure you want to submit RNA-seq data (raw data?) to GEO? You should submit to SRA instead. NCBI supports upload via Aspera Connect (ascp, which has a command-line interface very similar to scp), which is faster and more robust against interrupted network connections. See: https://www.ncbi.nlm.nih.gov/books/NBK242625/ and https://www.ncbi.nlm.nih.gov/sra/docs/submit/
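As an illustration of what such an ascp call might look like when driven from a script, here is a small Python sketch that assembles the command line. The flags reflect my understanding of commonly documented ascp options (`-i` key file, `-Q` fair transfer policy, `-T` no encryption, `-l` rate cap, `-k` resume level, `-d` create the target directory); please verify them against `ascp -h` and the NCBI pages linked above. The helper names, the key file path, and the destination string are hypothetical placeholders, not real NCBI values.

```python
import subprocess


def build_ascp_command(key_file, source, destination,
                       max_rate="100m", resume_level=1):
    """Assemble an ascp argument list (assumed flags; check `ascp -h`)."""
    return [
        "ascp",
        "-i", key_file,            # private key used for authentication
        "-Q",                      # fair transfer policy
        "-T",                      # disable encryption for throughput
        "-l", max_rate,            # cap on the transfer rate
        "-k", str(resume_level),   # resume partially transferred files
        "-d",                      # create the target directory if needed
        source,
        destination,
    ]


def run_ascp(key_file, source, destination):
    """Run ascp and raise CalledProcessError if it exits with an error."""
    subprocess.run(build_ascp_command(key_file, source, destination), check=True)


# Example with placeholder values (not executed here):
# run_ascp("key.openssh", "fastq_dir", "user@host.example:uploads/my_folder")
```

The resume option is what makes this more forgiving than plain FTP: rerunning the same command after an interruption should pick up partially transferred files rather than starting over.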


Thank you very much for the recommendation! The truth is I didn't know about SRA. Unfortunately, the journal I am submitting the paper to asks me to deposit the data (raw data, FASTQ files) in GEO or ArrayExpress. I will try ArrayExpress next time, as someone above said it is better, unless the journal changes its policy.

Comment written 2.2 years ago by mgdrnl