1000 Genomes download files
6 months ago
Barista ▴ 10

Hi!

I would like to download VCF files for a particular population from the 1000 Genomes site, for around 100 people. Each sample I found in the data section has a .vcf.gz file for each chromosome. Is there a way to download all files for all of these people at once? Could I use an API for this? I am also having difficulty unzipping the .vcf.gz files.

I highly appreciate all help!

1000genomes vcf • 530 views

If your question is where you can download all VCF files for all individuals (and populations) from the 1000G project: the last time I checked with the Ensembl helpdesk, they pointed me to this ftp server link. I worked with these files a while ago and also considered using the REST API, but if I recall correctly, I got errors when requesting data in bulk (specific genomic intervals plus a specific population). I decided it was best to download the files and query them locally with tools like vcftools and bcftools instead.
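
To give an idea of what "download everything, then query locally" could look like, here is a minimal Python sketch that builds one URL per chromosome and fetches each file once. `BASE_URL` and `NAME_TEMPLATE` are hypothetical placeholders, not the real server path or file names; replace them with what you see in the FTP directory listing. The helper names `build_urls` and `download_all` are also made up for this example.

```python
from pathlib import Path
from urllib.request import urlretrieve

# Hypothetical placeholders: substitute the real directory and the real
# file-name pattern from the FTP listing you were pointed to.
BASE_URL = "ftp://example.org/path/to/release"
NAME_TEMPLATE = "chr{chrom}.example.vcf.gz"


def build_urls(base_url=BASE_URL, template=NAME_TEMPLATE, chromosomes=None):
    """Return one URL per chromosome (autosomes 1-22 by default)."""
    if chromosomes is None:
        chromosomes = [str(n) for n in range(1, 23)]
    base = base_url.rstrip("/")
    return [f"{base}/{template.format(chrom=c)}" for c in chromosomes]


def download_all(urls, dest_dir, fetch=urlretrieve):
    """Download each URL into dest_dir, skipping files already present."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    saved = []
    for url in urls:
        target = dest / url.rsplit("/", 1)[-1]
        if not target.exists():
            # fetch(url, filename) matches urlretrieve's first two arguments
            fetch(url, str(target))
        saved.append(target)
    return saved
```

With the placeholders filled in, `download_all(build_urls(), "vcf_downloads")` would fetch the per-chromosome files into a local folder, and rerunning it skips anything already downloaded.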

As for the difficulty unzipping the .vcf.gz files: I would not unzip them at all if I were you, since most tools that handle VCF files accept .vcf.gz input directly. For example, this is how you would read a .vcf.gz file in vcftools: vcftools --gzvcf input_file.vcf.gz. Have a look at the vcftools manual for more. If I remember correctly, you can use --keep <filename> to subset the VCF file to only the selected individuals.
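
If you want to see the same idea without any external tool, the sketch below reads a .vcf.gz directly with Python's standard `gzip` module and keeps only chosen sample columns, roughly what `--keep` does. It is an illustration only, not a replacement for vcftools or bcftools: the function name `subset_samples` is made up for this example, and it does not recompute INFO fields such as allele counts.

```python
import gzip


def subset_samples(vcf_gz_path, keep_ids):
    """Yield VCF lines restricted to the samples named in keep_ids.

    The file is decompressed on the fly; nothing is unzipped to disk.
    INFO values (e.g. AC/AN) are passed through unchanged, not recomputed.
    """
    keep = set(keep_ids)
    columns = None
    with gzip.open(vcf_gz_path, "rt") as handle:
        for raw in handle:
            line = raw.rstrip("\n")
            if line.startswith("##"):
                # Meta-information lines are copied as they are.
                yield line
                continue
            fields = line.split("\t")
            if line.startswith("#CHROM"):
                # The first 9 columns are fixed (CHROM..FORMAT);
                # sample columns start at index 9.
                fixed = list(range(min(9, len(fields))))
                wanted = [i for i, name in enumerate(fields)
                          if i >= 9 and name in keep]
                columns = fixed + wanted
            if columns is None:
                raise ValueError("data line found before the #CHROM header")
            yield "\t".join(fields[i] for i in columns)
```

For example, `for line in subset_samples("input_file.vcf.gz", ["ID_A", "ID_B"]): print(line)` would print a reduced VCF, where `ID_A` and `ID_B` stand in for sample IDs from your population list.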


Thanks a lot, I will try it! I also tried the REST API before and got errors as well. I will try again with vcftools and bcftools, thanks!
