How to download a VCF from the 1000 Genomes Project with population frequencies?
4.1 years ago

Hi folks,

I would like to get the population frequencies for a list of SNPs from the 1000 Genomes Project (https://www.ncbi.nlm.nih.gov/variation/tools/1000genomes/).

Currently, I am searching one SNP at a time and downloading each result as a VCF. Is there a way to search with my whole list and get the population data for all of my SNPs at once?

Is there a command-line tool that can do this?

SNPs 1000Genome VCF rsids • 6.9k views
4.1 years ago

The three boldfaced links at https://www.cog-genomics.org/plink/2.0/resources#1kg_phase3 provide one solution. With that dataset downloaded, plink2 can report allele frequencies for any predefined population, as well as any population you define yourself. (You can also export VCF files from it, and subsetting is frequently >10x faster than bcftools.)
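As a rough sketch of what that could look like for a list of SNPs, the Python snippet below only builds and prints two plink2 command lines, one per population, without running them. It assumes the downloaded fileset has been decompressed and has the prefix `all_phase3`, that the sample file has `SuperPop` and `Population` columns, and that `snps.txt` (a hypothetical file name) holds one variant ID per line matching the IDs in the variant file. The flags used (`--pfile`, `--keep-if`, `--extract`, `--freq`, `--out`) are standard plink2 options, but check them against your plink2 version before relying on this.

```python
import shlex


def plink2_freq_command(pfile_prefix, snp_list, population_column,
                        population_value, out_prefix):
    """Build a plink2 argument list that reports allele frequencies
    for the samples belonging to one population."""
    return [
        "plink2",
        "--pfile", pfile_prefix,
        # Keep only samples whose population column equals the given value.
        "--keep-if", population_column, "==", population_value,
        # Restrict the output to the variant IDs listed in snp_list.
        "--extract", snp_list,
        # Write an allele frequency report next to out_prefix.
        "--freq",
        "--out", out_prefix,
    ]


# Assumed names: all_phase3 (fileset prefix), snps.txt (ID list),
# Population/SuperPop (sample-file columns), CEU and AFR (population codes).
commands = [
    plink2_freq_command("all_phase3", "snps.txt", "Population", "CEU", "ceu_freq"),
    plink2_freq_command("all_phase3", "snps.txt", "SuperPop", "AFR", "afr_freq"),
]

for command in commands:
    # Print a copy-pasteable shell command instead of executing it.
    print(shlex.join(command))
```

Each printed command can be pasted into a shell, or passed to `subprocess.run(command, check=True)` once plink2 is installed. Running one command per population gives two frequency reports that can be joined on variant ID to compare the groups.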


Do these files contain SNPs for all populations across the entire genome? I would like to compare allele frequencies between the Northern European and African populations. Is that possible?

4.1 years ago

Hello bioinforesearchquestions,

If you are able to code, use Ensembl's REST API, especially the variation endpoint. There is also an endpoint for querying multiple IDs at once.

fin swimmer
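
A minimal sketch of the REST API suggestion above, in Python with only the standard library. It assumes the multi-ID endpoint is `POST /variation/homo_sapiens` on `https://rest.ensembl.org`, that the `pops=1` option adds population frequencies to each record, and that 1000 Genomes entries carry population names starting with `1000GENOMES:`. The function names are made up for illustration, and the response fields should be checked against the current Ensembl documentation.

```python
import json
import urllib.request

ENSEMBL_REST = "https://rest.ensembl.org"


def fetch_variants(rsids, species="homo_sapiens"):
    """POST a list of rsIDs to the variation endpoint and return the
    decoded JSON (assumed to be a dict keyed by rsID)."""
    url = f"{ENSEMBL_REST}/variation/{species}?pops=1"
    body = json.dumps({"ids": list(rsids)}).encode("utf-8")
    request = urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json",
                 "Accept": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=60) as response:
        return json.load(response)


def thousand_genomes_frequencies(variants):
    """Flatten the response into (rsid, population, allele, frequency)
    tuples, keeping only entries whose population name starts with
    the assumed 1000 Genomes prefix."""
    rows = []
    for rsid, record in variants.items():
        for entry in record.get("populations", []):
            name = entry.get("population", "")
            if name.startswith("1000GENOMES:"):
                rows.append((rsid, name,
                             entry.get("allele"), entry.get("frequency")))
    return rows
```

Calling `thousand_genomes_frequencies(fetch_variants(my_rsids))` with your own list of rsIDs would then give one row per SNP, population, and allele. For long lists, split the IDs into batches, since the server limits how many IDs a single POST may contain.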
