1000G Integrated Call Data File Size
10.5 years ago
Pierre ▴ 130

Hi,

The following two sites report substantially different sizes for the files containing the integrated calls per chromosome.

ftp://ftp.1000genomes.ebi.ac.uk/vol1/ftp/phase1/analysis_results/integrated_call_sets/

http://ftp.1000genomes.ebi.ac.uk/vol1/ftp/phase1/analysis_results/integrated_call_sets/

For example, integrated calls for chromosome 1:

ftp: 10/12/12 12:00 [GMT] 4,294,967,295 ALL.chr1.integrated_phase1_v3.20101123.snps_indels_svs.genotypes.vcf.gz

http: 12-Oct-2012 13:52 [GMT] 11G ALL.chr1.integrated_phase1_v3.20101123.snps_indels_svs.genotypes.vcf.gz

Any idea why this is the case?

When I try to download the larger files from the http link, the download is terminated after the first gigabyte with an error message indicating that the source could not be found.

10.5 years ago
Laura ★ 1.8k

These data are sourced from exactly the same disk, so there should be no difference between the files.
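One thing worth noting about the numbers in the question: the FTP listing's 4,294,967,295 is exactly 2^32 - 1, the largest value an unsigned 32-bit integer can hold. That may mean the listing caps the displayed size rather than that the file is smaller, but that reading is an assumption and has not been confirmed in this thread. The arithmetic can be checked with a short sketch (treating "11G" as roughly 11 GiB is also an assumption):

```python
# Size shown by the FTP listing in the question, in bytes.
ftp_listed_bytes = 4_294_967_295

# Largest value representable in an unsigned 32-bit integer.
max_uint32 = 2**32 - 1

# The listed size coincides exactly with the 32-bit ceiling.
size_hits_ceiling = ftp_listed_bytes == max_uint32
print(size_hits_ceiling)

# Assumption: the HTTP listing's "11G" means about 11 GiB.
approx_http_bytes = 11 * 1024**3

# A file of that size cannot be expressed in 32 bits.
exceeds_uint32 = approx_http_bytes > max_uint32
print(exceeds_uint32)
```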

The best way to be certain that you have the same file we have is to check the md5, which you can always find in our current tree file:

http://ftp.1000genomes.ebi.ac.uk/vol1/ftp/current.tree
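A minimal sketch of that md5 check in Python, for anyone who wants to script it. It assumes you have already downloaded both the data file and a local copy of current.tree, and that each line of current.tree is tab-separated with the file path in the first column and the md5 in the last. That layout is an assumption, so check it against the actual file. The function names are illustrative only.

```python
import hashlib


def md5_of_file(path, chunk_size=1024 * 1024):
    """Compute the md5 of a file in chunks, so multi-gigabyte files fit in memory."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def expected_md5(tree_path, file_name):
    """Look up the md5 recorded for file_name in a local copy of current.tree.

    Assumption: tab-separated lines, path in the first column, md5 in the last.
    Returns None if no line matches.
    """
    with open(tree_path, "r") as handle:
        for line in handle:
            fields = line.rstrip("\n").split("\t")
            if fields[0] == file_name or fields[0].endswith("/" + file_name):
                return fields[-1]
    return None


# Example usage (paths are placeholders for your local copies):
# name = "ALL.chr1.integrated_phase1_v3.20101123.snps_indels_svs.genotypes.vcf.gz"
# print(md5_of_file(name) == expected_md5("current.tree", name))
```

If the two values differ, the download is incomplete or corrupted, whatever size the listing reported.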

If you are consistently seeing errors from our http mount, please send your command line, the error messages, and your external IP to info@1000genomes.org, and we can get our sysadmins to look into it.


Thank you, Laura, for this reply as well as your previous comments on the 1000G-related questions. I really appreciate your help. In that case, I will use the md5 to validate the files that I have downloaded.
