Question: How to download a VCF from the 1000 Genomes Project with population frequencies?
bioinforesearchquestions wrote (17 months ago, United States):

Hi folks,

I would like to get the population frequencies for a list of SNPs from the 1000 Genomes Project (https://www.ncbi.nlm.nih.gov/variation/tools/1000genomes/).

Currently, I am searching for one SNP at a time and downloading each result as a VCF. Is there a way to search for my whole list at once and get the population data for all of my SNPs?

Is there a command-line tool that can do this?

chrchang523 wrote (17 months ago, United States):

The three boldfaced links at https://www.cog-genomics.org/plink/2.0/resources#1kg_phase3 provide one solution. With that dataset downloaded, plink2 can report allele frequencies for any predefined population, as well as any population you define yourself. (You can also export VCF files from it, and subsetting is frequently >10x faster than bcftools.)

sf21 wrote (17 months ago, New York):

I would suggest downloading the sites VCF from http://ftp.1000genomes.ebi.ac.uk/vol1/ftp/release/20130502/ and using bcftools to subset it to your SNPs of interest with the -R (--regions-file) or -T (--targets-file) option. Both options take a file of positions.

Links to the sites file and its index:
http://ftp.1000genomes.ebi.ac.uk/vol1/ftp/release/20130502/ALL.wgs.phase3_shapeit2_mvncall_integrated_v5b.20130502.sites.vcf.gz
http://ftp.1000genomes.ebi.ac.uk/vol1/ftp/release/20130502/ALL.wgs.phase3_shapeit2_mvncall_integrated_v5b.20130502.sites.vcf.gz.tbi
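
If you only have rsIDs rather than positions, one alternative is to match on the ID column directly. Below is a rough Python sketch of that idea, not part of the suggestion above. It assumes the sites VCF carries the overall and super-population frequencies in the INFO keys AF, EAS_AF, AMR_AF, AFR_AF, EUR_AF and SAS_AF. The function names (parse_info, frequencies_for, scan_sites_vcf) are made up for illustration. It scans the whole file line by line, so expect it to be much slower than an indexed bcftools query.

```python
import gzip

# INFO keys assumed to hold allele frequencies in the phase 3 sites VCF
# (overall plus the five super-populations).
AF_KEYS = ("AF", "EAS_AF", "AMR_AF", "AFR_AF", "EUR_AF", "SAS_AF")


def parse_info(info):
    """Turn 'AC=1;AF=0.5;FLAG' into {'AC': '1', 'AF': '0.5', 'FLAG': True}."""
    out = {}
    for item in info.split(";"):
        key, sep, value = item.partition("=")
        out[key] = value if sep else True
    return out


def to_floats(value):
    """Split a comma-separated INFO value (one entry per ALT allele)."""
    return [None if x == "." else float(x) for x in value.split(",")]


def frequencies_for(lines, rsids, keys=AF_KEYS):
    """Yield (rsid, chrom, pos, ref, alt, freqs) for records whose ID is wanted."""
    wanted = set(rsids)
    for line in lines:
        if line.startswith("#"):
            continue  # skip meta-information and header lines
        cols = line.rstrip("\n").split("\t")
        chrom, pos, ids, ref, alt, info = cols[0], cols[1], cols[2], cols[3], cols[4], cols[7]
        hits = wanted.intersection(ids.split(";"))
        if not hits:
            continue
        fields = parse_info(info)
        freqs = {k: to_floats(fields[k]) for k in keys if k in fields}
        for rsid in sorted(hits):
            yield rsid, chrom, int(pos), ref, alt, freqs


def scan_sites_vcf(path, rsids):
    """Stream a gzip/bgzip-compressed sites VCF from disk (hypothetical helper)."""
    with gzip.open(path, "rt") as handle:
        yield from frequencies_for(handle, rsids)
```

Usage would be something like `for row in scan_sites_vcf("ALL.wgs.phase3_shapeit2_mvncall_integrated_v5b.20130502.sites.vcf.gz", my_rsids): print(row)`, where my_rsids is your own list. Multi-allelic records give one frequency per ALT allele, which is why each value is a list.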

finswimmer wrote (17 months ago, Germany):

Hello bioinforesearchquestions,

If you are able to code, use Ensembl's REST API, especially the variation endpoint. There is also an endpoint for querying multiple IDs at once.

fin swimmer
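
For anyone who wants to try the REST route, here is a minimal Python sketch of a multi-ID query. It rests on some assumptions you should check against the current Ensembl REST documentation: that the server is https://rest.ensembl.org, that a POST to variation/human accepts a JSON body of the form {"ids": [...]}, that pops=1 adds population frequencies, and that each returned record has a "populations" list with "population", "allele" and "frequency" entries. The function names and the "1000GENOMES:phase_3" prefix filter are illustrative choices, not anything fixed by the API.

```python
import json
import urllib.request

SERVER = "https://rest.ensembl.org"  # assumed public Ensembl REST server


def build_request(rsids, species="human"):
    """Build a POST request for the multi-ID variation endpoint with pops=1."""
    body = json.dumps({"ids": list(rsids)}).encode("utf-8")
    return urllib.request.Request(
        f"{SERVER}/variation/{species}?pops=1",
        data=body,
        headers={"Content-Type": "application/json", "Accept": "application/json"},
        method="POST",
    )


def kg_frequencies(response, prefix="1000GENOMES:phase_3"):
    """Flatten a decoded response into (rsid, population, allele, frequency) rows."""
    rows = []
    for rsid, record in response.items():
        for pop in record.get("populations", []):
            # Keep only populations whose name starts with the assumed prefix.
            if pop["population"].startswith(prefix):
                rows.append((rsid, pop["population"], pop["allele"], pop["frequency"]))
    return rows


def fetch(rsids):
    """Send the request and return the flattened rows (needs network access)."""
    with urllib.request.urlopen(build_request(rsids)) as reply:
        return kg_frequencies(json.load(reply))
```

Calling `fetch(my_rsids)` with your own list of rsIDs should then give one row per SNP, population and allele. For long lists you will probably need to send the IDs in batches, since the server limits how many it accepts per request.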
