"Exploring OpenSNP" is a project I did for an intro stats class. Although the bioinformatics component is a bit soft, It might be useful for beginners learning R, dplyr, ggplot, RMarkdown, and knitr. (and now it is Binder-enabled)
Source code: https://github.com/leipzig/opensnp
Genome-wide association studies are designed to look for genetic markers, typically single nucleotide polymorphisms, in microarray or sequencing data that are associated with phenotypes – diseases or other physical characteristics. OpenSNP (https://opensnp.org) is a community “crowdsourced” project in which ordinary people can submit genotypes obtained from commercial direct-to-consumer genetic testing providers such as 23andMe, along with whatever phenotypic descriptors – both physical (eye color, height) and behavioral (disposition, preferences) – they choose to reveal.
Can we validate published genome-wide association studies of common human phenotypes using the relatively small set of volunteered data publicly available in OpenSNP?