Forum:Discussion: Uploading terabytes of data to NCBI SRA
9 months ago
James Reeve ▴ 130

Large genomic datasets are becoming increasingly common, and we often need to find a place to archive and share data once a project is done. Probably the most widely used archive is NCBI's Sequence Read Archive (SRA). However, if you have ever tried to use it for large datasets, you will know that the FTP upload option is a pain: there are frequent drop-outs, and you only have a few minutes after logging in to navigate to your directory and start the upload. In short, I find data archiving a very frustrating experience.

I want to get a discussion going about tips and tricks for streamlining this process. In particular, I'd like to know how to connect a remote server to NCBI.

NCBI WGS archiving servers remote • 1.4k views
9 months ago
GenoMax 142k

If you need to upload tens of terabytes or more to NCBI, then you should reach out to SRA support directly and work out a solution.

Otherwise, Aspera Connect should be the preferred upload method: https://www.ncbi.nlm.nih.gov/sra/docs/submitfiles/ In theory, uploads would only be limited by the bandwidth your institution allows you to use with Aspera (I would imagine that NCBI has access to larger networking pipes than you do).
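For anyone scripting this from a remote server, here is a minimal Python sketch of how an Aspera upload with automatic retries might look. It assumes the `ascp` command-line client is installed on the server, and that the key file and the destination folder are the ones the SRA Submission Portal gives you. The flags follow the `ascp` example as I recall it from the NCBI page linked above, so check them against that page; the function names, the retry count and the wait time are purely illustrative.

```python
import subprocess
import time


def build_ascp_command(key_file, source_dir, remote_folder, max_rate="100m"):
    """Return the ascp argument list for uploading source_dir to NCBI.

    key_file and remote_folder are placeholders: both come from the
    SRA Submission Portal and are specific to your account.
    """
    return [
        "ascp",
        "-i", key_file,    # private key file for the upload account
        "-QT",             # -Q: fair transfer policy, -T: no encryption
        f"-l{max_rate}",   # cap the transfer rate (e.g. 100m = 100 Mbit/s)
        "-k1",             # resume interrupted files when attributes match
        "-d",              # create the destination directory if missing
        source_dir,
        f"subasp@upload.ncbi.nlm.nih.gov:{remote_folder}",
    ]


def upload_with_retries(command, attempts=5, wait_seconds=60,
                        run=subprocess.run, sleep=time.sleep):
    """Run command until it exits with status 0 or attempts run out.

    Returns True on success, False if every attempt failed. The run and
    sleep arguments exist only so the logic can be tested without ascp.
    """
    for attempt in range(1, attempts + 1):
        result = run(command)
        if result.returncode == 0:
            return True
        if attempt < attempts:
            sleep(wait_seconds)  # wait before retrying a dropped transfer
    return False
```

Because `-k1` lets `ascp` resume partially transferred files, a retry after a drop-out should not start from scratch. Running the script inside `screen` or `tmux` on the remote server keeps the upload going even if your own connection to that server drops.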

Looks like they also provide uploads from Amazon S3 buckets.

9 months ago

I've never had a problem uploading hundreds of GB of data to the SRA (from both within and outside the US). If frequent drop-outs are a problem, you might need to check with your internet provider.


I'm based at a field station, so drop-outs are unfortunately unavoidable. I want to use the remote server since it's based in a major city.


If your data is located on the remote server, then there should be no issues with drop-outs. It sounds like the problem may be with your link from the field station to the remote server (is the data being generated at the field station?).
