SNPs/indels with individual genotypes from the 1000 Genomes FTP site
4.0 years ago
lait ▴ 160

Sorry if this might be a trivial question!

I read a lot about this until I got lost. I need to download a WGS VCF file from the 1000 Genomes FTP site. I need the SNPs (SNVs and indels) and, most importantly, the individual genotypes of all the people involved.

So, for example, this file:

which has been referenced many times on Biostars, does not contain individual genotypes. I need something similar to what those files contain. Is there one global file containing SNPs/indels for WGS data, including the genotypes of the individual samples?

thanks!

4.0 years ago

You can download the entire data set per chromosome (chr1-22 and chrX), including individual genotypes for both indels and SNPs, using this code:

prefix="ftp://ftp.1000genomes.ebi.ac.uk/vol1/ftp/release/20130502/ALL.chr"
suffix=".phase3_shapeit2_mvncall_integrated_v5a.20130502.genotypes.vcf.gz"

for chr in {1..22} X; do
    # fetch the VCF and its tabix index (.tbi) for this chromosome
    wget "${prefix}${chr}${suffix}" "${prefix}${chr}${suffix}.tbi"
done
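
Once a file is down, it may be worth confirming that it really carries per-sample genotypes. A rough sketch, assuming only standard `awk`: in a VCF, the `#CHROM` header line has 8 fixed columns, then `FORMAT`, then one column per sample, so a sites-only file has no columns beyond the ninth. The function name `count_samples` is made up for this example.

```shell
# count_samples (hypothetical helper): read an uncompressed VCF on stdin and
# print how many per-sample genotype columns the #CHROM header line declares.
# Prints 0 for a sites-only file; returns non-zero if no #CHROM line is found.
count_samples() {
    awk -F'\t' '
        /^#CHROM/ { n = (NF > 9) ? NF - 9 : 0; found = 1; exit }
        END { if (!found) exit 1; print n }
    '
}
```

Usage would be something like `gzip -dc ALL.chr21.phase3_shapeit2_mvncall_integrated_v5a.20130502.genotypes.vcf.gz | count_samples`. Only the header is read, so it should return quickly even for a large file.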


Kevin

I can't get the files; it says the host cannot be resolved. I also tried the NCBI website, but none of the pages would open. Is there another way to download the human VCF files directly from the terminal?

I can connect - I did it just now 10 seconds ago. To where are you downloading the data?

I tried several; the only page that opened is http://ftp.1000genomes.ebi.ac.uk/vol1/ftp/release/20130502/. The others were ftp://ftp.ncbi.nih.gov/snp/organisms/human_9606/VCF/ and the links provided at https://www.ncbi.nlm.nih.gov/variation/docs/human_variation_vcf/ and http://www.internationalgenome.org/data#download.

Thank you, but these are also giving me timeout errors. It did work really fast with $ wget http://ftp.1000genomes.ebi.ac.uk/vol1/ftp/release/20130502/ALL.chr21.phase3_shapeit2_mvncall_integrated_v5a.20130502.genotypes.vcf.gz.tbi, though. Is there a problem with the server? Are these sites pointing at the same data?

That file is a tabix index (.tbi) file, which is very small, so it will download very quickly in most places unless you are using a dial-up modem of 7.5kbps (or less).

Let's just try chr1 variants, first:

wget ftp://ftp.1000genomes.ebi.ac.uk/vol1/ftp/release/20130502/ALL.chr1.phase3_shapeit2_mvncall_integrated_v5a.20130502.genotypes.vcf.gz
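
If the ftp:// address keeps timing out, one thing to try is the same path over http://, since the http:// listing for this release did open earlier in this thread. A minimal sketch, assuming both addresses serve the same files (not verified here); `fetch_1kg` is a made-up helper name, and the retry and timeout values are arbitrary.

```shell
# fetch_1kg (hypothetical helper): download one file from the 20130502
# release, trying ftp:// first and then http:// on the same host.
fetch_1kg() {
    local file="$1"
    local path="ftp.1000genomes.ebi.ac.uk/vol1/ftp/release/20130502/${file}"
    local scheme
    for scheme in ftp http; do
        # -c resumes a partial download; --tries and --timeout bound each attempt
        if wget -c --tries=3 --timeout=60 "${scheme}://${path}"; then
            return 0
        fi
        echo "download via ${scheme} failed" >&2
    done
    return 1
}
```

It would be called as `fetch_1kg ALL.chr1.phase3_shapeit2_mvncall_integrated_v5a.20130502.genotypes.vcf.gz`. Because of `-c`, rerunning after an interrupted transfer should pick up where it left off rather than start over.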