Question: Biological Significance of Gene Modules/Computationally Detected Protein Complexes
Hi friends...

I am at an early stage of my research in bioinformatics. I have applied different machine learning and data mining techniques to detect gene clusters from microarray gene expression data. I also know that protein complexes can be detected from protein-protein interaction data using machine learning or data mining algorithms.

I also know that Gene Ontology terms and pathways can be used to assess the biological significance of gene clusters and computationally predicted protein complexes.

1) What other types of biological significance analysis can be performed on gene clusters to reach a meaningful conclusion? A list would be very helpful.

2) What are the different ways to predict whether the genes in a cluster are associated with a particular disease?

Make a start by reading through this: Gene co-expression analysis for functional classification and gene–disease predictions.

Other metrics that can be used to infer the importance of 'modules' (or 'clusters'), again in the context of network analysis, include:

  • hub score
  • closeness centrality
  • vertex degree
  • betweenness centrality
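
To make these four metrics concrete, here is a minimal sketch in plain Python on a made-up five-gene module. The gene labels, the edge list, and all function names are hypothetical and only for illustration. The hub score is computed as a HITS-style power iteration scaled so the top gene gets 1.0, and betweenness uses Brandes' algorithm without normalization. Both are choices made for this sketch, and dedicated network packages may scale the values differently.

```python
from collections import deque

# Hypothetical toy module: undirected co-expression edges between genes.
EDGES = [("A", "B"), ("A", "C"), ("B", "C"), ("A", "D"), ("D", "E")]


def build_graph(edges):
    """Build an undirected adjacency map {gene: set(neighbours)}."""
    graph = {}
    for u, v in edges:
        graph.setdefault(u, set()).add(v)
        graph.setdefault(v, set()).add(u)
    return graph


def degree(graph):
    """Vertex degree: number of direct neighbours of each gene."""
    return {n: len(nbrs) for n, nbrs in graph.items()}


def closeness(graph):
    """Closeness: reachable genes divided by the sum of shortest-path lengths."""
    result = {}
    for source in graph:
        dist = {source: 0}
        queue = deque([source])
        while queue:  # breadth-first search from the source gene
            u = queue.popleft()
            for v in graph[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total = sum(dist.values())
        result[source] = (len(dist) - 1) / total if total else 0.0
    return result


def betweenness(graph):
    """Unnormalized betweenness (Brandes' algorithm, unweighted, undirected)."""
    score = dict.fromkeys(graph, 0.0)
    for s in graph:
        stack = []
        preds = {n: [] for n in graph}
        sigma = dict.fromkeys(graph, 0)  # number of shortest paths from s
        sigma[s] = 1
        dist = {s: 0}
        queue = deque([s])
        while queue:
            u = queue.popleft()
            stack.append(u)
            for v in graph[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
                if dist[v] == dist[u] + 1:
                    sigma[v] += sigma[u]
                    preds[v].append(u)
        delta = dict.fromkeys(graph, 0.0)
        while stack:  # accumulate dependencies from the farthest genes inward
            w = stack.pop()
            for u in preds[w]:
                delta[u] += sigma[u] / sigma[w] * (1 + delta[w])
            if w != s:
                score[w] += delta[w]
    # Each unordered pair was counted from both endpoints.
    return {n: b / 2 for n, b in score.items()}


def hub_score(graph, iterations=100):
    """HITS-style hub score by power iteration, scaled so the maximum is 1.0."""
    hub = dict.fromkeys(graph, 1.0)
    for _ in range(iterations):
        auth = {n: sum(hub[m] for m in graph[n]) for n in graph}
        hub = {n: sum(auth[m] for m in graph[n]) for n in graph}
        top = max(hub.values())
        hub = {n: h / top for n, h in hub.items()}
    return hub


graph = build_graph(EDGES)
metrics = {
    "vertex degree": degree(graph),
    "closeness centrality": closeness(graph),
    "betweenness centrality": betweenness(graph),
    "hub score": hub_score(graph),
}
for name, values in metrics.items():
    ranked = sorted(values, key=values.get, reverse=True)
    print(f"{name}: genes ranked from most to least central: {ranked}")
```

In this toy module, gene A comes out on top for all four metrics, which is the kind of pattern that would flag it as a candidate hub gene. On real data you would build the edge list from your own co-expression or protein-protein interaction network and would probably use an established network library rather than hand-written functions.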

Also, look up GSVA (GSVA: gene set variation analysis for microarray and RNA-Seq data). It is implemented in R, and there are tutorials to follow.

Finally, as you are just starting, please remember that in silico evidence will always be less convincing than in vitro evidence. In silico tools can help to guide the course of a study and generate new hypotheses, but the real acid test comes in the wet laboratory.


