Question: Population Private SNPs
ThePlaintiff (Cape Town, South Africa) wrote:

I am working on 1000 Genomes data. For every population, I'd like to find the SNPs that occur only in that population (population-private SNPs). My plan is to take set differences between the SNP sets of different populations: for YRI and LWK, say, I'd take all the SNPs in YRI and filter out those shared with LWK, then repeat the exercise against the other populations. I suspect this kind of functionality has already been implemented in one of the VCF analysis tools or genome analysis packages. If you know of a command or pipeline that does this, please let me know. I could code up the solution myself, but it would save me a great deal of time if I could avoid duplicating existing work.

snp next-gen • 742 views
modified 6 months ago by dawson.white • written 19 months ago by ThePlaintiff
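For reference, the set-difference idea in the question can be sketched in a few lines of Python. This is only an illustration, not an existing tool: the function name `private_snps` and the rsIDs are made up, and it assumes the SNP IDs for each population have already been extracted into sets. Instead of subtracting populations pair by pair, it counts how many populations carry each SNP, which gives the same result as removing the union of all other populations.

```python
from collections import Counter


def private_snps(snps_by_pop):
    """Map each population to the SNP IDs seen in that population only."""
    # Count in how many populations each SNP occurs.
    pops_per_snp = Counter()
    for snps in snps_by_pop.values():
        pops_per_snp.update(set(snps))
    # A SNP is private when exactly one population carries it.
    return {
        pop: {snp for snp in set(snps) if pops_per_snp[snp] == 1}
        for pop, snps in snps_by_pop.items()
    }


# Toy input; the rsIDs are placeholders, not real variants.
example = {
    "YRI": {"rs1", "rs2", "rs3"},
    "LWK": {"rs2", "rs4"},
    "CEU": {"rs3", "rs5"},
}
result = private_snps(example)
print(result)
```

The counting approach makes a single pass over the data, so it scales better than comparing every pair of populations when there are many of them.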

I don't know about existing tools. But we could get the allele frequency per population, then use set operations to get the SNP lists?

written 19 months ago by zx8754

^^ It does indeed seem to be as straightforward as zx8754 describes. The allele frequency data can be used to infer which alleles are present in only one population group. If I were actively working on this, I would spend some time getting the 1000 Genomes data into a single BCF and also a PLINK dataset, where it would then be easier to work with.

written 19 months ago by Kevin Blighe
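A minimal sketch of the frequency-based route described in the two comments above, assuming the per-population allele counts have already been computed by some tool and loaded as rows of (SNP ID, population, allele count). The row layout, the function name `private_alleles`, and the `min_count` threshold are all hypothetical choices for illustration. Note that an allele missing from a sample of a population is not proof that it is absent from the whole population.

```python
from collections import defaultdict


def private_alleles(rows, min_count=1):
    """Return {population: [snp, ...]} for alleles carried by one population.

    rows is an iterable of (snp_id, population, allele_count) tuples.
    An allele counts as present in a population when allele_count >= min_count.
    """
    carriers = defaultdict(set)
    for snp, pop, count in rows:
        if count >= min_count:
            carriers[snp].add(pop)

    private = defaultdict(list)
    for snp, pops in carriers.items():
        if len(pops) == 1:
            private[next(iter(pops))].append(snp)
    return {pop: sorted(snps) for pop, snps in private.items()}


# Toy counts; the rsIDs and numbers are invented.
rows = [
    ("rs1", "YRI", 12), ("rs1", "LWK", 0), ("rs1", "CEU", 0),
    ("rs2", "YRI", 5), ("rs2", "LWK", 7), ("rs2", "CEU", 0),
    ("rs3", "YRI", 0), ("rs3", "LWK", 0), ("rs3", "CEU", 3),
]
by_pop = private_alleles(rows)
print(by_pop)
```

Raising `min_count` above 1 would exclude singletons, which may be preferable if sequencing errors are a concern.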

Thanks, I have one more question. I split the bed files by sub-population by running plink --bfile <MyFile.bed> --keep </path/to/sample/ids>. How do I test whether the allele frequencies differ across populations? I am considering 7 sub-populations of the 1000 Genomes data set for my analysis. I believe I'll need to build a phenotype file for this, but I am not clear on how to build the file and run it in plink. I would appreciate an example of the file format and, if possible, the plink commands.

written 19 months ago by ThePlaintiff
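One possible way to avoid splitting the data at all, offered as a sketch rather than a confirmed answer from this thread: PLINK 1.9 accepts a cluster file via --within (three columns: FID, IID, cluster name), and combining it with --freq should report frequencies per cluster. The Python below builds such a file from a sample-to-population panel. It assumes a whitespace-delimited panel with a header line starting with "sample" and the population code in the second column, and it assumes the family ID equals the sample ID; both assumptions need checking against your own .fam file. The function name `cluster_lines` and the sample IDs are placeholders.

```python
def cluster_lines(panel_text, keep_pops, family_id=None):
    """Build 'FID IID CLUSTER' lines for the populations in keep_pops.

    panel_text: whitespace-delimited text, one sample per line, with the
    sample ID in column 1 and the population code in column 2 (assumed layout).
    family_id: fixed FID to use; defaults to the sample ID itself (assumption).
    """
    lines = []
    for raw in panel_text.strip().splitlines():
        fields = raw.split()
        if not fields or fields[0] == "sample":  # skip blank lines and header
            continue
        sample, pop = fields[0], fields[1]
        if pop in keep_pops:
            fid = sample if family_id is None else family_id
            lines.append(f"{fid} {sample} {pop}")
    return lines


# Toy panel; S1..S3 are placeholder sample IDs, not real 1000 Genomes samples.
panel = """sample pop super_pop gender
S1 YRI AFR male
S2 LWK AFR female
S3 CEU EUR male
"""
lines = cluster_lines(panel, {"YRI", "LWK"})
print("\n".join(lines))
```

The resulting lines could be written to a file such as clusters.txt and passed to something like plink --bfile <prefix> --freq --within clusters.txt, where <prefix> is the fileset name without the .bed extension.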
dawson.white wrote:

vcf-contrast is designed to do this. http://vcftools.sourceforge.net/perl_module.html#vcf-contrast

written 6 months ago by dawson.white