I was wondering if anyone has suggestions for a fairly simple GO term enrichment approach that would benefit a survivor/non-survivor study. Currently I have just been comparing the top n most represented terms among the top n most differentially expressed genes, e.g., in the survivors these are the top n terms, and in the non-survivors these are the top n terms, filtering for common terms. I've also ranked terms by the average change in expression, separately for up- and down-regulated genes.
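For concreteness, the current approach could be sketched roughly as follows. This is only an illustration: the function names and the `gene_to_terms` mapping are hypothetical, and "filtering for common terms" is assumed here to mean dropping terms that appear in both top-n lists.

```python
from collections import Counter

def top_terms(genes, gene_to_terms, n):
    """Return the n GO terms annotated to the most genes in `genes`."""
    counts = Counter()
    for gene in genes:
        # Count each term once per gene, even if it is listed twice.
        counts.update(set(gene_to_terms.get(gene, ())))
    # Sort by descending count, then by term ID so ties are deterministic.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [term for term, _ in ranked[:n]]

def group_specific_terms(terms_a, terms_b):
    """Drop terms shared by both lists (assumed meaning of the filter)."""
    common = set(terms_a) & set(terms_b)
    only_a = [t for t in terms_a if t not in common]
    only_b = [t for t in terms_b if t not in common]
    return only_a, only_b

# Toy data; the gene and term IDs are made up for illustration.
gene_to_terms = {
    "a": ["GO:1", "GO:2"],
    "b": ["GO:1"],
    "c": ["GO:3", "GO:2"],
    "d": ["GO:3"],
}
survivor_terms = top_terms(["a", "b"], gene_to_terms, 2)
non_survivor_terms = top_terms(["c", "d"], gene_to_terms, 2)
specific = group_specific_terms(survivor_terms, non_survivor_terms)
```

Note that this ranks terms by raw counts only, so broad terms that annotate many genes will tend to dominate both lists.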
Would it be sufficient to continue with the current approach, or would it be worth exploring other avenues (and if so, what would be some reasonable approaches)? I have played around with DAVID to a degree, but the goal behind introducing GO was to add coarse-grained support to the pathway analysis. DAVID results would warrant a more in-depth follow-up, and I want to avoid project creep.
UPDATE: The arrays came from another study in which RNA was collected only from experimental animals, and not all of them can be assumed to have been exposed to the same experimental conditions. Animals considered non-survivors here were those that met endpoint criteria before the end of the study. Survivors were those that had not yet reached endpoint criteria when the study period ended. The RNA was collected from liver tissue. This means that the samples are not time-matched and there is no data for control animals. The goal is to gain insight into the differences between the two experimental classes.
Differentially expressed here means a gene with a significant (p <= 0.05) fold change (FC) of at least 2.0 between the non-survivors and survivors. The arrays were processed with limma using background subtraction and normalized with VSN.
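As a minimal sketch of that filter, assuming the per-gene results have been exported as `(gene_id, p_value, log2_fc)` tuples with fold changes on a log2 scale (so FC >= 2.0 corresponds to |log2 FC| >= 1), and assuming a positive value means higher expression in non-survivors. The function names and the tuple layout are hypothetical:

```python
import math

def is_differentially_expressed(p_value, log2_fc, p_cutoff=0.05, fc_cutoff=2.0):
    """Apply the p <= 0.05 and FC >= 2.0 criteria in either direction."""
    return p_value <= p_cutoff and abs(log2_fc) >= math.log2(fc_cutoff)

def split_by_direction(results):
    """Split passing genes into up- and down-regulated lists.

    `results` is an iterable of (gene_id, p_value, log2_fc) tuples.
    Positive log2_fc is assumed to mean higher in non-survivors.
    """
    up, down = [], []
    for gene, p_value, log2_fc in results:
        if is_differentially_expressed(p_value, log2_fc):
            (up if log2_fc > 0 else down).append(gene)
    return up, down

# Toy values for illustration only.
results = [
    ("g1", 0.01, 1.5),   # passes, up
    ("g2", 0.20, 3.0),   # fails on p-value
    ("g3", 0.05, -1.0),  # passes at both boundaries, down
    ("g4", 0.01, 0.5),   # fails on fold change
]
up_genes, down_genes = split_by_direction(results)
```

Whether the p-value used here is raw or adjusted for multiple testing is not specified above; the sketch simply takes whatever value is passed in.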
GO term assignment has been established by mining for orthologous proteins through reciprocal BLAST against human and mouse proteins. Mouse and human GO terms were downloaded directly from the GOC website to obtain the most recent versions. This means a gene in the model animal will only have GO annotations if at least one gene in either human or mouse has a sufficient reciprocal best hit (RBH). It also limits the analysis to a subset of the available probes on the arrays. Human- and mouse-derived annotations are being considered separately, meaning I plan to run any GO enrichment I perform twice, once with the mouse set and once with the human set.
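The RBH step can be illustrated with a short sketch, assuming the BLAST results have already been reduced to `(query, subject, bitscore)` tuples for each search direction. The function names are hypothetical, and the threshold for what counts as a "sufficient" hit is not modelled here:

```python
def best_hits(hits):
    """Map each query to its highest-scoring subject.

    `hits` is an iterable of (query, subject, bitscore) tuples.
    """
    best = {}
    for query, subject, score in hits:
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return {query: subject for query, (subject, _) in best.items()}

def reciprocal_best_hits(forward_hits, reverse_hits):
    """Keep pairs where each protein is the other's best hit."""
    forward = best_hits(forward_hits)   # model animal -> reference
    reverse = best_hits(reverse_hits)   # reference -> model animal
    return {q: s for q, s in forward.items() if reverse.get(s) == q}

# Toy hits; IDs and scores are made up for illustration.
forward = [("m1", "h1", 200), ("m1", "h2", 150),
           ("m2", "h2", 90), ("m3", "h3", 50)]
reverse = [("h1", "m1", 210), ("h2", "m1", 140), ("h3", "m3", 60)]
rbh = reciprocal_best_hits(forward, reverse)
```

Here `m2` gets no ortholog, and therefore no GO annotations, because its best hit `h2` points back to `m1` instead.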
All of this is in a MySQL db, so it is flexible.
The idea is to use these annotations to provide a basic functional characterization of the differences between the two experimental groups.
Can you describe your study design a bit more? You have a survivor/non-survivor study, but you talk about two sets of genes. I would have thought that two groups would give only one set of genes.
I added a better description of the situation.