Question: Why does submitting high-throughput sequence data to GEO from an Amazon EC2 instance produce the error "Could not read reply from control connection -- timed out"?
Asked 12 weeks ago by Daniel Gerlach (Austria):

I am following the instructions on the Submit to GEO page to upload about 83 GB of gzipped RNA-seq data to the GEO FTP server. I first used the following command, but the connection timed out after every file:

ncftpput -B 33554432 -z -u 'username' -p 'password' -v -R \
  ftp-private.ncbi.nlm.nih.gov /fasp/ local_folder_to_upload

I then extended this into the following script, which retries until all files are uploaded:

#!/bin/bash
# Retry the upload until ncftpput exits with status 0.
cd /home/ec2-user || exit 1

try=0
COMPLETE_CONDITION=0   # ncftpput exit status that signals success

echo "START"

until [ "$lastresult" = "$COMPLETE_CONDITION" ]; do
  try=$((try + 1))
  echo "Try $try ..."
  # -z resumes partially uploaded files, -R recurses into the folder
  ncftpput -B 33554432 -z -u 'username' -p 'password' -v -R \
    ftp-private.ncbi.nlm.nih.gov /fasp/ local_folder_to_upload
  lastresult=$?
  echo "Last result code: $lastresult"
done

echo "UPLOAD COMPLETED AFTER $try TRY(S)"

exit 0

This worked in principle, and after several tries all samples were uploaded correctly to GEO. However, the error message persisted:

Could not read reply from control connection -- timed out.

Any thoughts on why this happens and how to resolve it? It does not seem to be critical, as all files appear to have been uploaded correctly.
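
As a side note on the retry loop above: it runs forever if the failure is permanent (a wrong password, for instance). A minimal sketch of a bounded variant follows. The helper name retry and its arguments (maximum attempts, delay in seconds) are made up for illustration and are not part of ncftpput. It assumes, as the original script does, that exit status 0 means success.

```shell
#!/bin/bash
# retry MAX DELAY CMD [ARGS...]
# Hypothetical helper: run CMD until it exits with status 0, at most MAX
# times, sleeping DELAY seconds between attempts.
# Returns 0 on success, otherwise the exit status of the last attempt.
retry() {
  local max="$1" delay="$2" attempt=1 status=0
  shift 2
  while true; do
    echo "Try $attempt of $max ..." >&2
    if "$@"; then
      return 0
    else
      status=$?
    fi
    if [ "$attempt" -ge "$max" ]; then
      echo "Giving up after $attempt tries (last status: $status)" >&2
      return "$status"
    fi
    sleep "$delay"
    attempt=$((attempt + 1))
  done
}

# Possible use with the command from the question (not executed here):
# retry 20 30 ncftpput -B 33554432 -z -u 'username' -p 'password' -v -R \
#   ftp-private.ncbi.nlm.nih.gov /fasp/ local_folder_to_upload
```

The limits of 20 attempts and 30 seconds in the commented usage line are arbitrary placeholders. This only makes the workaround safer; it does not address the cause of the timeout itself.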

Tags: next-gen, ftp, geo

This might not solve the problem, but does GEO offer a way to retrieve a hash such as SHA or MD5 for the uploaded files? If the checksums match, I would not worry too much. I would just want to make sure the files are not truncated.

Comment written 12 weeks ago by Michael Dondrup
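
For the local side of the checksum comparison suggested in the comment, a minimal sketch is given below. It assumes GNU md5sum is available on the EC2 instance. The function name write_md5_manifest is made up for illustration. Whether GEO exposes server-side checksums to compare against is not settled in this thread.

```shell
#!/bin/bash
# write_md5_manifest DIR OUT
# Hypothetical helper: record an MD5 checksum for every *.gz file below DIR
# (including subfolders) in the file OUT, with paths relative to DIR.
write_md5_manifest() {
  ( cd "$1" && find . -type f -name '*.gz' -exec md5sum {} + | sort -k 2 ) > "$2"
}

# Possible use (not executed here):
# write_md5_manifest local_folder_to_upload upload.md5
# Re-check the local files against the manifest later with:
# ( cd local_folder_to_upload && md5sum -c ../upload.md5 )
```

A truncated file would produce a different checksum from the one recorded before the upload, so the manifest gives a reference to compare against any hash obtained for the uploaded copies.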