"Exploring OpenSNP" is a project I did for an intro stats class. Although the bioinformatics component is a bit soft, it might be useful for beginners learning R, dplyr, ggplot, RMarkdown, and knitr. It is now Binder-enabled as well.
Report: http://leipzig.github.io/opensnp/
Source code: https://github.com/leipzig/opensnp
Genome-wide association studies are designed to look for genetic markers, typically single nucleotide polymorphisms, in microarray or sequencing data that are associated with phenotypes (diseases or other physical characteristics). OpenSNP (https://opensnp.org) is a community “crowdsourced” project in which ordinary people can submit genotypes obtained from commercial direct-to-consumer genetic testing providers such as 23andMe, along with whatever phenotypic descriptors they choose to reveal, both physical (eye color, height) and behavioral (disposition, preferences).
Can we validate published genome-wide association studies of common human phenotypes using the relatively small, volunteered dataset publicly available in OpenSNP?
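To make the question concrete, here is a minimal sketch of the kind of test involved: a Pearson chi-square test of independence between genotype at one SNP and a two-level phenotype. It is written in plain Python rather than the R used in the project, it is not taken from the report, and the counts are invented purely for illustration. The function names are hypothetical, and the p-value helper assumes exactly 2 degrees of freedom (three genotypes, two phenotype levels).

```python
import math


def chi_square_independence(table):
    """Return (statistic, degrees_of_freedom) for a contingency table.

    `table` is a list of rows of observed counts. Assumes every row and
    column total is nonzero, so no expected count is zero.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)

    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if genotype and phenotype were independent.
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected

    dof = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, dof


def p_value_two_dof(statistic):
    """Upper-tail chi-square probability, valid only for 2 degrees of freedom."""
    return math.exp(-statistic / 2.0)


# Invented counts, NOT OpenSNP data.
# Rows: genotypes AA, AG, GG. Columns: brown eyes, blue eyes.
toy_counts = [
    [30, 5],
    [25, 15],
    [5, 20],
]

statistic, dof = chi_square_independence(toy_counts)
p_value = p_value_two_dof(statistic)
print(f"chi-square = {statistic:.2f}, df = {dof}, p = {p_value:.2g}")
```

With only a few hundred volunteers per phenotype, counts in some cells can be small, and a chi-square approximation becomes unreliable there; an exact test would be the usual fallback in that case.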
Nice work Jeremy!
And just in time for the R course I have to give next week :) This is a great example of what can be done with some common R packages and a great idea!