Question: Any fast way to download 1000 Genomes Phase 3?
b.ambrozio0 wrote:

Hello! I'm trying to download the 1000 Genomes data (phase 3) through Aspera, but the instructions in the documentation don't work. On the command line (using ascp) I get the error: ERR [ascp] SSH authentication failed. E.g.:

ascp -i /home/ibmuser/.aspera/connect/etc/asperaweb_id_dsa.putty -Tr -Q -l 100M -P33001 -L- fasp-g1k@fasp.1000genomes.ebi.ac.uk:vol1/ftp/release/20130502/ALL.chr1.phase3_shapeit2_mvncall_integrated_v5a.20130502.genotypes.vcf.gz.tbi ./

Via Aspera Desktop I get: "SSH_MSG_DISCONNECT: 2 Too many authentication failures"

I have an FTP download in progress, but it's taking ages.

Thanks!

fasp aspera 1000genomes • 82 views
modified 11 days ago • written 11 days ago by b.ambrozio0

I tried the command on my computer and the file downloaded successfully. Which version of ascp are you using? Is the private key file path correct?

written 11 days ago by yztxwd170
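The two checks asked about above (ascp version, key file path) could be scripted roughly as follows. This is only a sketch: check_key is a hypothetical helper, the default key path is an assumption based on the Aspera Connect install location mentioned in the question, and -A is assumed to be the version flag of the installed ascp.

```shell
# Sketch: report the ascp version and check that the private key is readable.
# check_key is a hypothetical helper, not part of Aspera.
check_key() {
    if [ -r "$1" ]; then
        echo "key ok: $1"
    else
        echo "key missing or unreadable: $1"
        return 1
    fi
}

if command -v ascp >/dev/null 2>&1; then
    ascp -A | head -n 1   # assumption: -A prints version information
else
    echo "ascp not found on PATH"
fi

# Assumed default location; pass a different path as the first argument.
check_key "${1:-$HOME/.aspera/connect/etc/asperaweb_id_dsa.openssh}" || true
```

If the key check fails, the -i argument of the ascp command is the first thing to look at.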

Cross-posted: https://bioinformatics.stackexchange.com/questions/10739

written 11 days ago by Pierre Lindenbaum124k
b.ambrozio0 wrote:

OK, I got it working. The change was pretty much the -i parameter, which in my case, with the new version of ascp, had to point to asperaweb_id_dsa.openssh. That's funny, as I'm pretty sure I tried this yesterday and it didn't work (error: "Too many authentication failures"; I'm guessing the credentials were blocked, or something like that). Anyway, here's a script I wrote to download everything at once:

# Download the phase 3 genotype VCFs for chromosomes 1-22 via Aspera.
echo "Start: $(date)"

KEY=/home/ibmuser/.aspera/connect/etc/asperaweb_id_dsa.openssh
REMOTE=fasp-g1k@fasp.1000genomes.ebi.ac.uk:vol1/ftp/release/20130502

for CHR in $(seq 1 22); do
    FILE=ALL.chr$CHR.phase3_shapeit2_mvncall_integrated_v5a.20130502.genotypes.vcf.gz
    echo "Downloading '$FILE'..."
    # Same ascp options as in the question; only the key file differs.
    ascp -i "$KEY" -Tr -Q -l 100M -P33001 -L- "$REMOTE/$FILE" ./
done

echo "End: $(date)"

It's also available on my GitHub (along with an FTP version, if you prefer): https://github.com/bambrozio/bioinformatics/tree/master/utils

Thanks!

written 11 days ago by b.ambrozio0
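A variation on the download script in the answer above, as a sketch only: it also fetches the .tbi index for each chromosome (the file type used in the question's command) and defaults to a dry run that prints the ascp commands instead of running them. DRY_RUN, g1k_file, fetch and download_all are hypothetical names introduced here, and the default key path under $HOME is an assumption.

```shell
# Sketch: print (or run) one ascp command per chromosome VCF and its index.
# Remote address and ascp options are copied from the post.
KEY="${KEY:-$HOME/.aspera/connect/etc/asperaweb_id_dsa.openssh}"
REMOTE="fasp-g1k@fasp.1000genomes.ebi.ac.uk:vol1/ftp/release/20130502"

g1k_file() {
    # File name for one autosome, following the pattern used in the post.
    echo "ALL.chr$1.phase3_shapeit2_mvncall_integrated_v5a.20130502.genotypes.vcf.gz"
}

fetch() {
    # With DRY_RUN=1 (the default here) only print the command.
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "ascp -i $KEY -Tr -Q -l 100M -P33001 -L- $REMOTE/$1 ./"
    else
        ascp -i "$KEY" -Tr -Q -l 100M -P33001 -L- "$REMOTE/$1" ./
    fi
}

download_all() {
    local chr file
    for chr in $(seq 1 22); do
        file=$(g1k_file "$chr")
        fetch "$file"        # genotypes
        fetch "$file.tbi"    # tabix index
    done
}

download_all
```

Running it with DRY_RUN=0 in the environment would perform the actual transfers, assuming ascp is installed and the key path is valid.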

The downloads from https://www.cog-genomics.org/plink/2.0/resources#1kg_phase3 are ~70% smaller, and contain all the information in the VCFs ("plink2 --pfile ... --export vcf bgz" can be used to generate actual VCFs).

written 11 days ago by chrchang5235.8k
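The plink2 conversion mentioned in the reply above might look like the sketch below. The prefix all_phase3 and the output name are hypothetical placeholders for whatever the downloaded .pgen/.pvar/.psam set is called, and pfile_to_vcf_cmd is a helper invented for this example. The downloaded files may need decompressing first, so check the resource page before running it.

```shell
# Sketch: build the plink2 call that exports a pfile set to a bgzipped VCF.
# "all_phase3" is a hypothetical prefix for the .pgen/.pvar/.psam files.
pfile_to_vcf_cmd() {
    local prefix="$1" out="${2:-$1}"
    echo "plink2 --pfile $prefix --export vcf bgz --out $out"
}

cmd=$(pfile_to_vcf_cmd all_phase3 all_phase3_export)
echo "$cmd"

# Run it only if plink2 is installed and the input file is present.
if command -v plink2 >/dev/null 2>&1 && [ -f all_phase3.pgen ]; then
    $cmd
fi
```

Printing the command first makes it easy to confirm the file prefix before starting a long export.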