Question: How to extract SNPs from a VCF file based on population
13 months ago by aadhirareddy1323 wrote:

Dear Friends,

My VCF file has SNPs for several populations (Africa, America, Europe, East Asia and South Asia). I want to extract the data for Europe and East Asia together. Kindly let me know the possible ways to do this.

Thanks in Advance

linux 1000genomes vcf • 1.0k views
Modified 9 weeks ago by zx8754
13 months ago by Nandini wrote:

You can do this easily using vcftools, GATK tools, plinkseq etc.

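You first have to generate a text file listing the samples that make up the population(s) of your choice, one sample ID per line, with the IDs spelled exactly as in the VCF header. Let's call it "population_of_interest.txt". Then run either of the following: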

vcf-subset -e -c population_of_interest.txt input.vcf > output.vcf


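or, with vcftools itself (--stdout sends the recoded VCF to standard output; without it the result is written to out.recode.vcf and output.vcf stays empty):

vcftools --vcf input.vcf --keep population_of_interest.txt --recode --stdout > output.vcf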
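
The sample list itself can be produced from a population panel file. The following is only a minimal sketch: it assumes a whitespace-delimited panel with a header row and the columns sample, pop, super_pop (the layout used by the 1000 Genomes panel files, where Europe is EUR and East Asia is EAS). The function name make_sample_list and the file name samples.panel are hypothetical; adjust the column number and codes if your panel differs.

```shell
# Hypothetical helper: print the sample IDs (column 1) whose super-population
# code (column 3, an assumed layout) matches one of the requested codes.
# Usage: make_sample_list PANEL_FILE CODE [CODE ...]
make_sample_list() {
    panel="$1"; shift
    codes="$*"   # remaining arguments, e.g. "EUR EAS"
    awk -v codes="$codes" '
        BEGIN {
            # Build a lookup table of the requested codes
            n = split(codes, a, " ")
            for (i = 1; i <= n; i++) keep[a[i]] = 1
        }
        # Skip the header row, print the sample ID of every matching row
        NR > 1 && ($3 in keep) { print $1 }
    ' "$panel"
}

# Example call (file names are placeholders):
# make_sample_list samples.panel EUR EAS > population_of_interest.txt
```

The resulting file has one sample ID per line, which is the format both commands above expect.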

Thanks a ton Nandini ... it works :)

written 13 months ago by aadhirareddy1323

This code works fine when I run it on one chromosome at a time. But I want to extract SNPs for all chromosomes together. Please let me know if there is any other option.

written 19 days ago by padakanti.sridevi

It should work for all chromosomes. Does your input VCF file contain all chromosomes?
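
One quick way to answer that is to count the records per chromosome in the input file. This is a sketch for an uncompressed VCF; the function name list_chromosomes is hypothetical, and a gzipped file would first need to be decompressed (for example with zcat).

```shell
# Hypothetical helper: print each CHROM value found in a VCF together with
# its number of records, as "CHROM<TAB>COUNT", sorted by chromosome name.
# Usage: list_chromosomes input.vcf
list_chromosomes() {
    awk -F '\t' '
        # Header lines start with "#"; everything else is a variant record
        !/^#/ { count[$1]++ }
        END   { for (chrom in count) print chrom "\t" count[chrom] }
    ' "$1" | sort -k1,1
}
```

If only one chromosome is listed, the input file holds a single chromosome and the output of the subsetting step will too.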

written 9 days ago by Nandini
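
If the data are split into one VCF per chromosome (an assumption; the thread does not say so), the same filter can be applied to each file in a loop. The sketch below uses a hypothetical helper named subset_each and placeholder file names, and assumes uncompressed files ending in .vcf. vcftools appends ".recode.vcf" to the prefix given with --out.

```shell
# Hypothetical helper: run the same --keep filter on every VCF given.
# Usage: subset_each population_of_interest.txt chr1.vcf chr2.vcf ...
subset_each() {
    samples="$1"; shift
    for vcf in "$@"; do
        # chr1.vcf -> prefix chr1.subset -> output chr1.subset.recode.vcf
        prefix="${vcf%.vcf}.subset"
        vcftools --vcf "$vcf" --keep "$samples" --recode --out "$prefix"
    done
}

# The per-chromosome results can then be joined, for example with vcf-concat
# from the same vcftools distribution (file names are placeholders):
# vcf-concat chr1.subset.recode.vcf chr2.subset.recode.vcf > all.subset.vcf
```

Because every file is filtered with the same sample list, the outputs share the same sample columns, which is what concatenation requires.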