I am carrying out GO term enrichment on a set of up-regulated genes, and also on species-specific (OrthoMCL) gene clusters, and clade-specific gene clusters.
- I need to use a custom background of genes. For enrichment within the up-regulated genes, can I simply take all the GO terms present in the genome as a whole and use these as the background (GO terms in the up-regulated gene set vs. GO terms in the genomic background)?
- If this is valid, presumably I can do the same for the species-specific gene clusters (GO terms in species-specific clusters vs. GO terms in the genomic background for that species).
- For the clade-specific gene clusters, I have three species in the clade of interest, all of which were used in the OrthoMCL analysis. Can I take all the GO terms represented in the three genomes of that clade and use these as the background (GO terms in clade-specific gene clusters vs. GO terms in the combined genomic background of all three species)?
Hope this is correct.
Related to this: is it usual for many genes in a genome to have no GO term attached to them in the functional annotation? (This seems to be the case for the tomato genome.)
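
To make the comparison in the bullets above concrete, here is a minimal sketch of one way it could be set up. It assumes a one-sided hypergeometric test counted per gene rather than per term occurrence, and it assumes the background is restricted to genes that carry at least one GO annotation, so unannotated genes drop out of both the study set and the background. The function names, gene IDs and GO labels are hypothetical, and no multiple-testing correction is applied; this is an illustration of the idea, not a substitute for an established enrichment tool.

```python
from math import comb


def hypergeom_sf(k, N, K, n):
    """P(X >= k) for X ~ Hypergeometric(N genes, K with the term, n drawn)."""
    upper = min(K, n)
    total = comb(N, n)
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return hits / total


def go_enrichment(study_genes, background_annotations):
    """Return {GO term: p-value} for terms seen in the study set.

    background_annotations maps gene ID -> set of GO terms and should
    contain only annotated genes (assumption: unannotated genes are
    excluded from the background).
    """
    background = set(background_annotations)
    # Study genes without any annotation cannot contribute and are dropped.
    study = set(study_genes) & background
    N, n = len(background), len(study)

    terms = set()
    for gene in study:
        terms |= background_annotations[gene]

    results = {}
    for term in terms:
        K = sum(1 for g in background if term in background_annotations[g])
        k = sum(1 for g in study if term in background_annotations[g])
        results[term] = hypergeom_sf(k, N, K, n)
    return results


# Toy data (hypothetical IDs): six annotated genes form the background.
toy_background = {
    "g1": {"GO:A"}, "g2": {"GO:A"}, "g3": {"GO:A"},
    "g4": {"GO:B"}, "g5": {"GO:B"}, "g6": {"GO:B"},
}
# "g7" has no GO annotation, so it is ignored in the test.
toy_study = ["g1", "g2", "g7"]
toy_pvalues = go_enrichment(toy_study, toy_background)
print(toy_pvalues)
```

For the clade-specific case, the same function could be called with `background_annotations` built by merging the annotation tables of all three species, provided gene IDs do not collide between species.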