I'm doing an analysis that requires classifying gene sets from the C2/C5/C7 collections of the Molecular Signatures Database (http://www.broadinstitute.org/gsea/msigdb) into one of the following pathway types: regulatory, signaling, or metabolic. My current approach uses the pathway names (they come from BioCarta, KEGG, Reactome, PID, etc.): signaling pathways are those whose names contain the strings "SIGNAL", "ERK", or "MAPK"; metabolic pathways are those that contain "METABOL", "CATABOL", "GLYCO", or "PENTOSE"; and regulatory pathways are those that contain "REGULAT", "UPREG", "ACTIV", or "INHIBI". Of course, this is only a "first-order" approximation!
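For concreteness, here is a minimal Python sketch of the name-matching rule described above. The keyword lists are the ones given in this post; everything else is an assumption made for illustration: the function names `matching_types` and `classify_pathway` are hypothetical, and so is the choice to label a name "ambiguous" when it matches more than one type and "unassigned" when it matches none. The gene set names in the loop are only illustrative inputs.

```python
# Keyword lists copied from the post. The handling of multiple matches
# ("ambiguous") and of no match ("unassigned") is an assumption.
KEYWORDS = {
    "signaling": ("SIGNAL", "ERK", "MAPK"),
    "metabolic": ("METABOL", "CATABOL", "GLYCO", "PENTOSE"),
    "regulatory": ("REGULAT", "UPREG", "ACTIV", "INHIBI"),
}


def matching_types(name):
    """Return every pathway type with at least one keyword in the name."""
    upper = name.upper()  # match case-insensitively
    return [
        ptype
        for ptype, words in KEYWORDS.items()
        if any(word in upper for word in words)
    ]


def classify_pathway(name):
    """Return the single matching type, or 'ambiguous' / 'unassigned'."""
    hits = matching_types(name)
    if len(hits) == 1:
        return hits[0]
    return "ambiguous" if hits else "unassigned"


# Illustrative inputs only; the last name is made up to show a tie.
for gene_set in (
    "KEGG_MAPK_SIGNALING_PATHWAY",
    "KEGG_PENTOSE_PHOSPHATE_PATHWAY",
    "KEGG_REGULATION_OF_ACTIN_CYTOSKELETON",
    "EXAMPLE_REGULATION_OF_MAPK_SIGNALING",
):
    print(gene_set, classify_pathway(gene_set))
```

One weakness of plain substring matching shows up immediately: "ERK" also matches any name that contains "PERK", and a name such as "REGULATION_OF_..._SIGNALING" matches two types at once, which is part of why I would prefer an enrichment-based assignment.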
Does anyone have an example of a method (a PubMed ID, ideally with R code provided) that uses enrichment of GO terms to assign a gene set to one of these pathway types, namely the dominant one?
Thanks!