I am trying to download 20TB of metagenomic data from ENA. However, their standard FTP service rarely exceeds 400KB/s, and at that rate the full download would take well over a year.
I've been looking into Aspera as an alternative, but I am running into problems with it.
I installed it and tried to download the run ERR1729192:
ascp -v -i ~/.aspera/connect/etc/asperaweb_id_dsa.openssh -k 1 -T -l 100M -L- fasp@fasp.sra.ebi.ac.uk/vol1/fastq/ERR172/002/ERR1729192/ERR1729192.fastq.gz ./
And I get the following log:
LOG Configuration: using v2 configuration file "/home/galkin/.aspera/connect/etc/aspera.conf", user -
LOG Initializing FASP version 3.8.1.161147, license max rate=(unlimited), account no.=1, license no.=1 product=6
LOG Configured symlink actions: create=1, follow=1, follow_wide=0, skip=0
LOG Aspera Connect version 3.8.1.161274
LOG Alternate log directory: "-"
ascp: no remote host specified
Startup failed, exit
ERR no remote host specified
ERR Missing mandatory arguments
LOG FASP Session Start uuid=cfe1b5c8-890d-40c8-a93f-466f37d0b51c op=recv status=failed errcode=106 errstr="Usage error" source= dest= source_prefix=- local=0.0.0.0:0 peer=0.0.0.0:0 tcp_port=22 os="Linux 4.4.0-62-generic #83-Ubuntu SMP We" ver=3.8.1.161147 lic=6:1:1 peeros="-" peerver=- peerlic=0:0:0 proto_sess=20003 proto_udp=20000 proto_bwmeas=20000 proto_data=2000d
LOG FASP Session Params uuid=cfe1b5c8-890d-40c8-a93f-466f37d0b51c userid=0 user="-" targetrate=10000000 minrate=0 rate_policy=fair cipher=none resume=0 create=0 ovr=0 times=0 precalc=no mf=0 mf_path=- mf_suffix=.aspera-inprogress partial_file_suffix=- files_encrypt=no files_decrypt=no file_csum=NONE dgram_sz=1492 prepostcmd=- tcp_mode=no rtt_auto=no cookie="-" vl_proto_ver=1 peer_vl_proto_ver=0 vl_local=0 vlink_remote=0 vl_sess_id=0 srcbase=- rd_sz=0 wr_sz=0 cluster_num_nodes=1 cluster_node_id=0 cluster_multi_session_threshold=-1 range=0-0 keepalive=no test_login=no proxy_ip=- net_rc_alg=alg_none exclude_older/newer_than=0/0
LOG FASP Session Stop uuid=cfe1b5c8-890d-40c8-a93f-466f37d0b51c op=recv status=failed errcode=106 errstr="Usage error" source= dest= source_prefix=- local=0.0.0.0:0 peer=0.0.0.0:0 tcp_port=22 os="Linux 4.4.0-62-generic #83-Ubuntu SMP We" ver=3.8.1.161147 lic=6:1:1 peeros="-" peerver=- peerlic=0:0:0 proto_sess=20003 proto_udp=20000 proto_bwmeas=20000 proto_data=2000d
LOG FASP Session Params uuid=cfe1b5c8-890d-40c8-a93f-466f37d0b51c userid=0 user="-" targetrate=10000000 minrate=0 rate_policy=fair cipher=none resume=0 create=0 ovr=0 times=0 precalc=no mf=0 mf_path=- mf_suffix=.aspera-inprogress partial_file_suffix=- files_encrypt=no files_decrypt=no file_csum=NONE dgram_sz=1492 prepostcmd=- tcp_mode=no rtt_auto=no cookie="-" vl_proto_ver=1 peer_vl_proto_ver=0 vl_local=0 vlink_remote=0 vl_sess_id=0 srcbase=- rd_sz=0 wr_sz=0 cluster_num_nodes=1 cluster_node_id=0 cluster_multi_session_threshold=-1 range=0-0 keepalive=no test_login=no proxy_ip=- net_rc_alg=alg_none exclude_older/newer_than=0/0
LOG FASP Session Statistics [Receiver] id=cfe1b5c8-890d-40c8-a93f-466f37d0b51c delay=0ms rex_delay=0ms ooo_delay=0ms solicited_rex=0.00% rcvd_rex=0.00% rcvd_dups=0.00% ave_xmit_rate 0.00Kbps effective=0.00% effective_rate=0.00Kbps (detail: good_blks 0 bl_total 0 bl_orig 0 bl_rex 0 dup_blks 0 dup_last_blks 0 drop_blks_xnf 0) (sndr ctl: sent 0 rcvd 0 lost 0 lost 0.00%) (rcvr ctl: sent 0 rcvd 0 lost 0 lost 0.00%) (rex ctl: sent 0 rcvd 0 lost 0 lost 0.00%) (progress: tx_bytes 0 file_bytes 0 tx_time 0) rex_xmit_blks 0 xmit_total 0 rex_xmit_pct 0.00%
Using enaBrowserTools does not help:
enaDataGet -a -f fastq ERR1729192
Downloading file with md5 check:fasp.sra.ebi.ac.uk:/vol1/fastq/ERR172/002/ERR1729192/ERR1729192.fastq.gz
/home/galkin/.aspera/connect/bin -QT -L /home/galkin/ERR1729192/logs -l 100M -P33001 -i /home/galkin/.aspera/connect/etc/asperaweb_id_dsa.openssh era-fasp@fasp.sra.ebi.ac.uk:/vol1/fastq/ERR172/002/ERR1729192/ERR1729192.fastq.gz ./ERR1729192
/bin/sh: 1: /home/galkin/.aspera/connect/bin: Permission denied
Error with Aspera transfer: [Errno 2] No such file or directory: './ERR1729192/ERR1729192.fastq.gz'
Downloading file with md5 check:fasp.sra.ebi.ac.uk:/vol1/fastq/ERR172/002/ERR1729192/ERR1729192.fastq.gz
/home/galkin/.aspera/connect/bin -QT -L /home/galkin/ERR1729192/logs -l 100M -P33001 -i /home/galkin/.aspera/connect/etc/asperaweb_id_dsa.openssh era-fasp@fasp.sra.ebi.ac.uk:/vol1/fastq/ERR172/002/ERR1729192/ERR1729192.fastq.gz ./ERR1729192
/bin/sh: 1: /home/galkin/.aspera/connect/bin: Permission denied
Error with Aspera transfer: [Errno 2] No such file or directory: './ERR1729192/ERR1729192.fastq.gz'
Failed to download file after two attempts
Deleting directory ERR1729192
ERROR: Something unexpected went wrong please try again.
If problem persists, please contact datasubs@ebi.ac.uk for assistance.
I have contacted our administrator, and he says that our firewall settings are OK.
Meanwhile, the following example, which I found on the Internet, works fine and downloads at 90MB/s:
ascp -QTr -l 400M -i ~/.aspera/connect/etc/asperaweb_id_dsa.openssh anonftp@ftp-private.ncbi.nlm.nih.gov:/sra/sra-instant/reads/ByStudy/sra/SRP/SRP009/SRP009247 .
Has anybody faced this problem? What permission am I lacking? Is the problem on my server's side, or do I need to ask ENA for some kind of permission to use Aspera? What does "no remote host specified" mean?
You're missing a colon after the server name:
fasp@fasp.sra.ebi.ac.uk:/vol1/fastq/ERR172/002/ERR1729192/ERR1729192.fastq.gz
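For reference, here is a rough sketch of the full command with the colon restored. The flags are copied from the question. The variable names and the check that ascp is on the PATH are my own additions for illustration. Note one assumption: the enaDataGet output in the question connects as era-fasp on port 33001 (-P33001) rather than as fasp, so the sketch uses those values. Change REMOTE_USER back if fasp works for you.

```shell
# Remote source has the form user@host:path. The colon separates the
# host from the remote path; without it ascp cannot tell where the
# host name ends and reports "no remote host specified".
REMOTE_USER="era-fasp"   # assumption: taken from the enaDataGet output in the question
REMOTE_HOST="fasp.sra.ebi.ac.uk"
REMOTE_PATH="/vol1/fastq/ERR172/002/ERR1729192/ERR1729192.fastq.gz"
SRC="${REMOTE_USER}@${REMOTE_HOST}:${REMOTE_PATH}"

# Key file location as used in the question.
KEY="$HOME/.aspera/connect/etc/asperaweb_id_dsa.openssh"

echo "Source: $SRC"

# Only attempt the transfer if ascp is actually on the PATH.
if command -v ascp >/dev/null 2>&1; then
    # -P33001 is an assumption copied from the enaDataGet output above.
    ascp -v -i "$KEY" -k 1 -T -l 100M -P33001 "$SRC" ./
else
    echo "ascp not found on PATH" >&2
fi
```

If the transfer still fails after this change, the new error message should at least point at something other than the argument syntax.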
The link is wrong. ascp needs the source in its own specific format, so don't just copy the form used in the NCBI example!