Question: How to download a VCF from the 1000 Genomes Project with population frequencies?
2.1 years ago by
United States
bioinforesearchquestions280 wrote:

Hi folks,

I would like to get the population frequencies for a list of SNPs from the 1000 Genomes Project (https://www.ncbi.nlm.nih.gov/variation/tools/1000genomes/).

Currently, I am searching for one SNP at a time and downloading each result as a VCF. Is there a way to search with my whole list and get the population data for all of my SNPs at once?

Is there a command-line tool that can do this?

1000genome snps rsids vcf • 2.3k views
2.1 years ago by
chrchang5237.4k
United States
chrchang5237.4k wrote:

The three boldfaced links at https://www.cog-genomics.org/plink/2.0/resources#1kg_phase3 provide one solution. With that dataset downloaded, plink2 can report allele frequencies for any predefined population, as well as any population you define yourself. (You can also export VCF files from it, and subsetting is frequently >10x faster than bcftools.)
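As a rough illustration of the plink2 route, the sketch below only assembles a command line; it does not run anything. It assumes the downloaded files have been decompressed and share one prefix (the name `all_phase3` here is hypothetical), that the variant IDs in the .pvar file are rsIDs, and that `my_rsids.txt` and `eur_samples.txt` are files you prepare yourself (one variant ID per line, and one sample ID per line for the population you want). The flags used (`--pfile`, `--extract`, `--keep`, `--freq`, `--out`) are standard plink2 options, but check them against your plink2 build.

```python
import shlex


def plink2_freq_command(pfile_prefix, rsid_file, out_prefix, keep_file=None):
    """Build a plink2 argument list that reports allele frequencies.

    pfile_prefix: shared prefix of the .pgen/.pvar/.psam files (assumed).
    rsid_file:    text file with one variant ID per line.
    keep_file:    optional text file of sample IDs, e.g. one population.
    """
    cmd = [
        "plink2",
        "--pfile", pfile_prefix,   # load the 1000 Genomes fileset
        "--extract", rsid_file,    # restrict to the listed variant IDs
        "--freq",                  # write an allele frequency report
        "--out", out_prefix,       # prefix for the output files
    ]
    if keep_file is not None:
        # Restrict to a subset of samples before computing frequencies.
        cmd += ["--keep", keep_file]
    return cmd


# Hypothetical file names, for illustration only.
cmd = plink2_freq_command(
    "all_phase3", "my_rsids.txt", "my_snps_eur", keep_file="eur_samples.txt"
)
print(shlex.join(cmd))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` once plink2 is on your PATH. Running it once per population (with a different `keep_file` and `out_prefix` each time) is one simple way to get per-population frequencies.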

2.1 years ago by
sf2110
New York
sf2110 wrote:

I would suggest downloading the sites VCF from http://ftp.1000genomes.ebi.ac.uk/vol1/ftp/release/20130502/ and using bcftools to subset it to the SNPs of interest with the -R or -T flag.

Links to the sites file and its index:

http://ftp.1000genomes.ebi.ac.uk/vol1/ftp/release/20130502/ALL.wgs.phase3_shapeit2_mvncall_integrated_v5b.20130502.sites.vcf.gz
http://ftp.1000genomes.ebi.ac.uk/vol1/ftp/release/20130502/ALL.wgs.phase3_shapeit2_mvncall_integrated_v5b.20130502.sites.vcf.gz.tbi
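Note that -R and -T select by genomic coordinates, so they need positions rather than rsIDs. If you only have rsIDs, a plain-Python scan of the sites file is one possible alternative to bcftools (slower, but with no dependencies). The sketch below is not bcftools; it simply matches the VCF ID column against your list and pulls the frequency tags from the INFO column. It assumes the phase 3 sites file carries the tags AF, EAS_AF, AMR_AF, AFR_AF, EUR_AF and SAS_AF; the function names are made up for this example.

```python
import gzip

# INFO tags holding the global and per-super-population allele frequencies
# (assumed to be present in the phase 3 sites VCF).
POP_AF_KEYS = ("AF", "EAS_AF", "AMR_AF", "AFR_AF", "EUR_AF", "SAS_AF")


def parse_info(info):
    """Turn 'K1=V1;K2=V2;FLAG' into a dict; bare flags map to True."""
    fields = {}
    for item in info.split(";"):
        key, sep, value = item.partition("=")
        fields[key] = value if sep else True
    return fields


def population_frequencies(lines, rsids):
    """Yield (rsid, chrom, pos, ref, alt, freqs) for records whose ID matches.

    Frequencies stay as strings because multi-allelic records hold
    comma-separated values, one per ALT allele.
    """
    wanted = set(rsids)
    for line in lines:
        if line.startswith("#"):
            continue  # skip header lines
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 8:
            continue  # not a complete VCF record
        # The ID column may hold several IDs separated by ';'.
        matched = wanted.intersection(cols[2].split(";"))
        if not matched:
            continue
        info = parse_info(cols[7])
        freqs = {key: info[key] for key in POP_AF_KEYS if key in info}
        for rsid in sorted(matched):
            yield rsid, cols[0], int(cols[1]), cols[3], cols[4], freqs


def scan_sites_vcf(path, rsids):
    """Stream a gzipped sites VCF from disk and yield the matching records."""
    with gzip.open(path, "rt") as handle:
        yield from population_frequencies(handle, rsids)
```

Usage would be something like `for row in scan_sites_vcf("ALL.wgs.phase3_shapeit2_mvncall_integrated_v5b.20130502.sites.vcf.gz", my_rsids): print(row)`, where `my_rsids` is your own list. This reads the whole file once, so it does not use the .tbi index.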

2.1 years ago by
finswimmer14k
Germany
finswimmer14k wrote:

Hello bioinforesearchquestions,

If you are able to code, use Ensembl's REST API, especially the variation endpoint. There is also an endpoint for querying multiple IDs at once.
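A minimal sketch of the multi-ID query, using only the Python standard library. It assumes the POST `variation/homo_sapiens` endpoint on https://rest.ensembl.org accepts a JSON body of the form `{"ids": [...]}`, that `pops=1` asks for population frequencies, and that each returned record has a `populations` list whose entries carry `population`, `allele` and `frequency` keys. Please check the current Ensembl REST documentation before relying on any of this; the function names are made up for the example.

```python
import json
import urllib.request

ENSEMBL_REST = "https://rest.ensembl.org"


def build_variation_request(rsids, species="homo_sapiens"):
    """Prepare a POST request for several variant IDs at once."""
    body = json.dumps({"ids": list(rsids)}).encode("utf-8")
    return urllib.request.Request(
        # pops=1 is assumed to switch on population allele frequencies.
        f"{ENSEMBL_REST}/variation/{species}?pops=1",
        data=body,
        headers={"Content-Type": "application/json", "Accept": "application/json"},
        method="POST",
    )


def thousand_genomes_frequencies(response, prefix="1000GENOMES:phase_3"):
    """Flatten a decoded response into (rsid, population, allele, frequency).

    Only populations whose name starts with `prefix` are kept; the prefix
    and the response layout are assumptions about the API output.
    """
    rows = []
    for rsid, record in response.items():
        for pop in record.get("populations", []):
            name = pop.get("population", "")
            if name.startswith(prefix):
                rows.append((rsid, name, pop.get("allele"), pop.get("frequency")))
    return rows


def fetch_frequencies(rsids):
    """Send the request and return the flattened 1000 Genomes rows."""
    with urllib.request.urlopen(build_variation_request(rsids)) as reply:
        return thousand_genomes_frequencies(json.load(reply))
```

Calling `fetch_frequencies(["rs123", "rs456"])` (placeholder IDs) would perform the network request. The server limits how many IDs one POST may contain, so a long list should be sent in batches.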

fin swimmer
