There have been varied levels of success in genome-wide association studies (GWAS).
What do you recommend as exemplar papers that have well-documented methods? Any other online resources to check out? Any research groups to keep an eye on?
The following articles were really useful for me to understand the concepts around GWAS.
I would recommend the following reviews to understand the concept and methods. Most of these reviews refer to the major studies, and specific details can be obtained from the individual papers. The reviews will give you an overall idea of the concept, the statistical methods and the results to expect from a GWAS.
An easy-to-read review article that starts with basic concepts and discusses the future prospects of GWAS: Genome-wide association studies and beyond.
A detailed introduction to the basic concepts of GWAS from the perspective of vascular disease: Genome-wide Association Studies for Atherosclerotic Vascular Disease and Its Risk Factors
A great overview of the current state of GWAS: Genomewide Association Studies and Assessment of the Risk of Disease
A detailed overview of statistical methods: Prioritizing GWAS Results: A Review of Statistical Methods and Recommendations for Their Application
For a bioinformatics perspective, the review by Jason Moore et al. is a good start: Bioinformatics Challenges for Genome-Wide Association Studies
Soumya Raychaudhuri's review provides an overview of various approaches for interpreting variants from GWAS: Mapping rare and common causal alleles for complex human diseases.
For introductory material, the new blog Genomes Unzipped has a couple of great posts:
The recent paper from 23andMe, Web-Based, Participant-Driven Studies Yield Novel Genetic Associations for Common Traits, is a good example of a publication where methods were clearly explained.
http://www.genome.gov/gwastudies/ -- A Catalog of Published Genome-Wide Association Studies, as far as I can tell the best online database of GWAS results.
See also some of the resources mentioned in this previous thread.
I just saw this, which seems to be a very good summary of the situation, with links to informative discussions. One of the punch lines:
This study demonstrates that combining studies using meta-analysis, achieving massive sample sizes to detect extremely small effects can result in both clinically and biologically meaningful discoveries using GWAS.
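To make the point about pooled sample sizes concrete, here is a minimal sketch of a fixed-effect, inverse-variance-weighted meta-analysis, which is one common way of combining per-study effect estimates for a single SNP. It is not necessarily the method used in the study quoted above; the function name `fixed_effect_meta` and all of the numbers are made up for illustration.

```python
import math


def fixed_effect_meta(betas, ses):
    """Combine per-study effect estimates for one SNP.

    betas: per-study effect sizes (e.g. log odds ratios)
    ses:   per-study standard errors
    Returns (pooled beta, pooled standard error, z score, two-sided p-value).
    """
    # Each study is weighted by the inverse of its variance.
    weights = [1.0 / se ** 2 for se in ses]
    total = sum(weights)
    beta = sum(w * b for w, b in zip(weights, betas)) / total
    se = math.sqrt(1.0 / total)
    z = beta / se
    # Two-sided p-value under a standard normal null.
    p = math.erfc(abs(z) / math.sqrt(2))
    return beta, se, z, p


if __name__ == "__main__":
    # Hypothetical data: five studies with the same small effect,
    # each only weakly significant on its own.
    betas = [0.02] * 5
    ses = [0.01] * 5
    single = fixed_effect_meta(betas[:1], ses[:1])
    pooled = fixed_effect_meta(betas, ses)
    print("single study: z=%.2f p=%.3g" % (single[2], single[3]))
    print("five studies: z=%.2f p=%.3g" % (pooled[2], pooled[3]))
```

The effect estimate stays the same, but the pooled standard error shrinks roughly with the square root of the number of equally sized studies, which is why very small effects only become detectable at very large combined sample sizes.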
The papers that Khader mentions are ones I would also suggest you read. There was a recent paper on GWAS for personality traits which failed to show anything significant. This is also relevant because it shows one aspect of trying to take GWAS too far into the realm of culture and learned behavior. One review on this is here. Last week's article on Cloninger's Temperament scales is here.
Added in edit on 14 Oct 2011: The paper entitled "Pathways of distinction analysis: a new technique for multi-SNP analysis of GWAS data" is highly relevant.
Certainly two very important emerging trends in GWAS work are:
1) Assignment of function to SNPs with top association values. This has been demonstrated recently for SORT1. (Added in edit on 6 Dec 2011) The 2011 papers by Suhre et al. on GWAS of metabolite values from serum and urine often do well to point toward function, as in associations of substrate and product ratios to the gene encoding that enzyme or transporter.
2) Looking at the environment (diet, sleep, exercise, alcohol consumption, tobacco smoking, and altitude and latitude of residence, i.e. oxygen tension and seasonality/day length, respectively) in order to assess gene-environment interactions. This is now taking place for the Framingham Heart Study and GOLDN. We're doing the GOLDN GWAS and are involved in FHS.
You may also have a look at the special issue published in the Lancet in 2005. Although it may seem a little old, it includes a very interesting review by David Clayton. It is also worth reading the methods used by the Wellcome Trust Case Control Consortium, e.g.:
The Wellcome Trust Case Control Consortium. (2007) Genome-wide association study of 14,000 cases of seven common diseases and 3,000 shared controls. Nature 447: 661-678.
The 1000 Genomes project includes useful resources as well, especially real data that you can use to better understand issues surrounding GWAS analyses. You can also check the Bioconductor project (the snpMatrix package and the like). There is also an interesting map of published GWAS on Genome.gov.
You might be interested in the ICSB special topics workshop "The successes, challenges and prospects for GWAS mega-analyses for complex diseases such as schizophrenia", which takes place on 15 October 2010 in Edinburgh:
The last session in particular would be of interest to you:
Session 4: The road ahead
1) Introduction to novel computational paradigms (Eleazar Eskin)
2) Few samples with many features – lessons already learned from micro-array analyses (Geoff McLachlan)
3) Multi-instance learning and the benefit of sophisticated and customized machine learning approaches (Karsten Borgwardt)
Discussion: Future directions for GWAS analysis with customized machine learning approaches and data integration