Question: How to download a VCF from the 1000 Genomes Project with population frequencies?
bioinforesearchquestions (United States) wrote 21 months ago:

Hi folks,

I would like to grab the population frequencies for a list of SNPs from the 1000 Genomes Project (https://www.ncbi.nlm.nih.gov/variation/tools/1000genomes/).

Currently, I am searching for one SNP at a time and downloading each result as a VCF. Is there a way to search for my whole list at once and get the complete population data for my SNPs?

Is there a command-line tool that can handle this?

Tags: 1000genome, snps, rsids, vcf
chrchang523 (United States) wrote 21 months ago:

The three boldfaced links at https://www.cog-genomics.org/plink/2.0/resources#1kg_phase3 provide one solution. With that dataset downloaded, plink2 can report allele frequencies for any predefined population, as well as any population you define yourself. (You can also export VCF files from it, and subsetting is frequently >10x faster than bcftools.)

sf21 (New York) wrote 21 months ago:

I would suggest downloading the sites VCF from http://ftp.1000genomes.ebi.ac.uk/vol1/ftp/release/20130502/ and using bcftools to subset it to the SNPs of interest with the -R or -T flag. Both flags take a file of regions, so you need the positions of your SNPs.

Links to the sites file and its index:
http://ftp.1000genomes.ebi.ac.uk/vol1/ftp/release/20130502/ALL.wgs.phase3_shapeit2_mvncall_integrated_v5b.20130502.sites.vcf.gz
http://ftp.1000genomes.ebi.ac.uk/vol1/ftp/release/20130502/ALL.wgs.phase3_shapeit2_mvncall_integrated_v5b.20130502.sites.vcf.gz.tbi
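
If you only have rsIDs and no coordinates, one option is to match on the ID column instead. Below is a minimal Python sketch of that idea, not a tested pipeline. It assumes the phase 3 sites file carries the overall AF tag plus the super-population tags EAS_AF, AMR_AF, AFR_AF, EUR_AF and SAS_AF in the INFO column; the function names and the input file name my_rsids.txt (one rsID per line) are made up for illustration. If I recall the bcftools expression syntax correctly, `bcftools view -i 'ID=@my_rsids.txt'` should select the same records.

```python
import gzip

# INFO tags assumed to hold the frequencies in the phase 3 sites VCF.
FREQ_TAGS = ("AF", "EAS_AF", "AMR_AF", "AFR_AF", "EUR_AF", "SAS_AF")


def parse_info(info):
    """Turn 'AC=1;AF=0.01;...' into a dict; flag-only tags map to True."""
    out = {}
    for item in info.split(";"):
        key, sep, value = item.partition("=")
        out[key] = value if sep else True
    return out


def frequencies_for_rsids(lines, wanted):
    """Yield (rsid, chrom, pos, ref, alt, freqs) for records whose ID is in `wanted`.

    Frequency values are kept as strings because multi-allelic sites
    list one comma-separated value per ALT allele.
    """
    for line in lines:
        if line.startswith("#"):
            continue  # skip meta lines and the column header
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 8:
            continue  # not a complete VCF record
        chrom, pos, ids, ref, alt, _qual, _filter, info = fields[:8]
        # The ID column may hold several IDs separated by ';'.
        hits = wanted.intersection(ids.split(";"))
        if not hits:
            continue
        tags = parse_info(info)
        freqs = {tag: tags[tag] for tag in FREQ_TAGS if tag in tags}
        for rsid in sorted(hits):
            yield rsid, chrom, int(pos), ref, alt, freqs


def scan_sites_vcf(vcf_path, rsid_path):
    """Linear scan of a gzipped sites VCF for the rsIDs listed in rsid_path."""
    with open(rsid_path) as fh:
        wanted = {ln.strip() for ln in fh if ln.strip()}
    with gzip.open(vcf_path, "rt") as vcf:
        yield from frequencies_for_rsids(vcf, wanted)
```

Calling scan_sites_vcf on the downloaded sites file and my_rsids.txt, then printing each tuple, would give one row per matching SNP. This reads the whole file and ignores the .tbi index, so it is slow on the genome-wide VCF; with coordinates in hand, the indexed bcftools route is the better choice.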

finswimmer (Germany) wrote 21 months ago:

Hello bioinforesearchquestions,

If you are able to code, use Ensembl's REST API, especially the variation endpoint. There is also an endpoint for querying multiple IDs at once.

fin swimmer
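
For anyone who wants a starting point, here is a rough Python sketch of the multi-ID query, using only the standard library. It assumes the POST variation/:species endpoint on https://rest.ensembl.org accepts a JSON body of the form {"ids": [...]} and the optional pops=1 parameter, and that each returned record has a "populations" list whose entries carry "population", "allele" and "frequency" keys, with 1000 Genomes populations named with a "1000GENOMES:phase_3" prefix. The function names are made up, and the response layout should be checked against the current Ensembl documentation before relying on it.

```python
import json
import urllib.request

SERVER = "https://rest.ensembl.org"


def build_request(rsids, species="homo_sapiens"):
    """Build a POST request for the multi-ID variation endpoint.

    pops=1 asks the server to include population allele frequencies.
    """
    url = f"{SERVER}/variation/{species}?pops=1"
    body = json.dumps({"ids": list(rsids)}).encode("utf-8")
    headers = {"Content-Type": "application/json", "Accept": "application/json"}
    return urllib.request.Request(url, data=body, headers=headers, method="POST")


def extract_frequencies(response, prefix="1000GENOMES:phase_3"):
    """Flatten the decoded JSON into (rsid, population, allele, frequency) rows.

    Only populations whose name starts with `prefix` are kept
    (the prefix is an assumption about how Ensembl names them).
    """
    rows = []
    for rsid, record in response.items():
        for pop in record.get("populations", []):
            name = pop.get("population", "")
            if name.startswith(prefix):
                rows.append((rsid, name, pop.get("allele"), pop.get("frequency")))
    return rows


def fetch_frequencies(rsids):
    """Send one request for all rsIDs and return the flattened rows."""
    with urllib.request.urlopen(build_request(rsids)) as resp:
        return extract_frequencies(json.load(resp))
```

Calling fetch_frequencies with a list of rsIDs returns one row per SNP, population and allele. The server limits how many IDs one POST may carry and rate-limits requests, so a long list needs to be sent in chunks; check the endpoint documentation for the current limits.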
