Question: Population Private SNPs
11 months ago by
Cape Town, South Africa
ThePlaintiff10 wrote:

I am working on 1000 Genomes data. For every population, I'd like to find the SNPs that occur only in that population (population-private SNPs). My plan is to iteratively take set differences between the SNP sets of different populations: for YRI and LWK, say, I'd take all the SNPs in YRI and filter out those shared with LWK, then repeat the exercise against the other populations. I suspect this kind of functionality has already been implemented in one of the VCF analysis tools or genome analysis packages. If you know of a command or pipeline that does this, please let me know. I could code up the solution myself, but it would save me a great deal of time if I could avoid duplicating existing work.
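The set-difference idea described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `private_snps`, the dictionary layout, and the SNP IDs are all hypothetical, and the per-population SNP sets are assumed to have been extracted already (for example from per-population VCFs).

```python
def private_snps(snps_by_pop):
    """Map each population to the SNP IDs seen in that population only."""
    private = {}
    for pop, snps in snps_by_pop.items():
        # Union of the SNPs observed in every other population.
        others = set().union(*(s for p, s in snps_by_pop.items() if p != pop))
        private[pop] = set(snps) - others
    return private


# Toy data: hypothetical SNP IDs, not real 1000 Genomes calls.
snps_by_pop = {
    "YRI": {"rs1", "rs2", "rs3"},
    "LWK": {"rs2", "rs4"},
    "CEU": {"rs3", "rs5"},
}
result = private_snps(snps_by_pop)
```

Subtracting the union of all other populations in one step avoids repeating the pairwise comparison for every pair of populations.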

snp next-gen • 356 views

I don't know of any existing tools. But we could get the allele frequencies per population, then use set operations to get the SNP lists?

written 11 months ago by zx8754 (48.8k)

^^ It does indeed seem to be as straightforward as zx8754 describes. The allele frequency data can be used to infer which alleles are present in only one population group. If I were actively working on this, I would spend some time getting the 1000 Genomes data into a single BCF and also a PLINK dataset, where it would be easier to work with.
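As a rough sketch of the frequency-based approach, the Python below reads a per-population allele count table and keeps the SNPs whose counted allele is observed in exactly one population. The column names (`SNP`, `CLST`, `MAC`) are assumed to follow the layout of a PLINK 1.9 cluster-stratified frequency report, the function name `private_alleles` is hypothetical, and the rows are toy values rather than real data.

```python
from collections import defaultdict


def private_alleles(lines):
    """Return {SNP: population} for alleles counted in exactly one population."""
    rows = (line.split() for line in lines if line.strip())
    header = next(rows)
    snp_i, clst_i, mac_i = (header.index(c) for c in ("SNP", "CLST", "MAC"))
    carriers = defaultdict(set)  # SNP -> populations where the allele count > 0
    for row in rows:
        if int(row[mac_i]) > 0:
            carriers[row[snp_i]].add(row[clst_i])
    return {snp: next(iter(pops)) for snp, pops in carriers.items() if len(pops) == 1}


# Toy table (assumed column layout, made-up values).
toy_report = """\
CHR SNP CLST A1 A2 MAF MAC NCHROBS
1 rs1 YRI A G 0.05 10 200
1 rs1 LWK A G 0 0 180
1 rs2 YRI T C 0.10 20 200
1 rs2 LWK T C 0.022 4 180
1 rs3 YRI C T 0 0 200
1 rs3 LWK C T 0.011 2 180
"""
private = private_alleles(toy_report.splitlines())
```

This only tracks the allele that was counted (A1 in the toy table). A site where the other allele is the private one would need the complementary check (count below the number of observed chromosomes in exactly one population).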

written 11 months ago by Kevin Blighe (53k)

Thanks, I have one more question. I split the bed files by sub-population by running plink --bfile <MyFile.bed> --keep </path/to/sample/ids>. How do I test whether the allele frequencies differ across populations? I am considering 7 sub-populations of the 1000 Genomes data set for my analysis. I believe I'll need to build a phenotype file for this, but I am not clear on how to build the file and run it in PLINK. I would appreciate an example of the file format and, if possible, the PLINK commands.
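One possible route, sketched here under assumptions rather than offered as a confirmed answer from the thread, is to label samples with a cluster file instead of a phenotype file. As far as I know, PLINK 1.9 accepts a three-column `FID IID CLUSTER` file via `--within`, and `--bfile` takes the fileset prefix rather than the `.bed` filename. The Python below builds such a file from a sample panel assumed to have whitespace-separated `sample pop super_pop gender` columns. The function name, the toy sample IDs, and the assumption that FID equals IID are all hypothetical and should be checked against your own `.fam` file.

```python
def panel_to_clusters(panel_lines, populations):
    """Return PLINK --within lines (FID IID CLUSTER) for the chosen populations."""
    out = []
    for line in panel_lines:
        fields = line.split()
        if len(fields) < 2 or fields[0] == "sample":
            continue  # skip blank lines and the header row
        sample, pop = fields[0], fields[1]
        if pop in populations:
            # Assumption: FID equals IID in the PLINK fileset; adjust if yours differs.
            out.append(f"{sample} {sample} {pop}")
    return out


# Toy panel with made-up sample IDs.
toy_panel = """\
sample pop super_pop gender
S1 YRI AFR male
S2 CEU EUR female
S3 LWK AFR female
"""
cluster_lines = panel_to_clusters(toy_panel.splitlines(), {"YRI", "LWK"})

# Write cluster_lines to clusters.txt, then (PLINK 1.9, unverified on this data):
#   plink --bfile MyFile --within clusters.txt --freq   # per-cluster frequencies
#   plink --bfile MyFile --within clusters.txt --fst    # per-variant Fst across clusters
```

With the cluster file in place, the populations can stay in a single fileset, so there is no need to split the data with `--keep` before comparing frequencies.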

written 10 months ago by ThePlaintiff (10)