Question: How to create a simulated SNP genotypes dataset
3.5 years ago by
United Kingdom
fairweath89 wrote:

I am trying to create a simulated SNP dataset consisting of ~40 individuals from two populations, with data from around 1000 co-dominant SNP markers, exhibiting a predefined (low) level of population structure. The result I am looking for is similar to that given by GenAlEx's create function, with one crucial difference: I need a dataset comprising four SNP alleles overall (A, C, G, T), but only two alleles per SNP marker (A/T, C/G, C/T, etc.), with all combinations present, as you would expect from a typical population genetic SNP dataset.

Essentially, I am after a function that lets me set parameters such as the level of structure, the number of individuals, and the number of alleles overall and per SNP, and that returns a SNP dataset roughly adhering to those parameters. An R solution would be preferable, as it is the only language I am comfortable with for the moment.

snp R • 1.6k views
3.5 years ago by
andrew.j.skelton73 wrote:

Plink is by far your best bet for this - 
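
If an R-only route is preferred, the requirements in the question can be sketched roughly as below. This is not PLINK's or GenAlEx's method. It is a minimal illustration that assumes the Balding-Nichols model, in which each population's allele frequency is drawn from a Beta distribution centred on a shared ancestral frequency, with `fst` controlling the spread. The function name `simulate_snps`, its arguments, and the 0.1 to 0.9 range for ancestral frequencies are all hypothetical choices made for the example.

```r
# Sketch only: biallelic SNP genotypes for two populations under the
# Balding-Nichols model (an assumption, not taken from the thread).
simulate_snps <- function(n_per_pop = 20, n_snps = 1000, fst = 0.02,
                          bases = c("A", "C", "G", "T")) {
  n_pops <- 2
  pop <- rep(seq_len(n_pops), each = n_per_pop)

  # Ancestral frequency of allele 1 at each SNP (range is an arbitrary choice)
  p_anc <- runif(n_snps, 0.1, 0.9)

  # Population-specific frequencies: one column per population,
  # Beta draws with mean p_anc and variance p_anc * (1 - p_anc) * fst
  scale <- (1 - fst) / fst
  p_pop <- matrix(rbeta(n_snps * n_pops, p_anc * scale, (1 - p_anc) * scale),
                  nrow = n_snps)

  geno <- matrix("", nrow = length(pop), ncol = n_snps,
                 dimnames = list(paste0("ind", seq_along(pop)),
                                 paste0("snp", seq_len(n_snps))))
  for (j in seq_len(n_snps)) {
    # Two distinct bases for this SNP, e.g. A/T or C/G
    al <- sample(bases, 2)
    # Copies of allele 1 per individual (Hardy-Weinberg within each population)
    n_a1 <- rbinom(length(pop), 2, p_pop[j, pop])
    labels <- c(paste(al[2], al[2], sep = "/"),
                paste(al[1], al[2], sep = "/"),
                paste(al[1], al[1], sep = "/"))
    geno[, j] <- labels[n_a1 + 1]
  }
  data.frame(pop = factor(pop), geno, stringsAsFactors = FALSE)
}

# Example: 2 x 20 individuals, 1000 SNPs, low structure
set.seed(42)
sim <- simulate_snps(n_per_pop = 20, n_snps = 1000, fst = 0.02)
sim[1:3, 1:5]
```

Note that `fst` here is the model parameter, not a guarantee: with only 20 individuals per population, the Fst estimated from the simulated data will vary around that value from run to run.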
