Calculating The Depth/Level Of Shared Gene Ontology (GO) Terms Between Protein Pairs
12.1 years ago
Sally ▴ 60

I am currently working on a task that requires me to assess the shared GO terms between protein pairs. I have 30,000 protein pairs (PPI data) and I would like to determine the depth/level in the GO hierarchy that each protein pair shares. Could anyone please suggest how to proceed with this? Thanks a lot!

Example: two gene products, AFUA_5G07340 and AFUA_2G11040.

Task: Find the shared GO terms between the protein pair and determine their level/depth in the GO hierarchy (where molecular function, cellular component, and biological process can be assigned level 3 in the hierarchy). I have to perform this task for 30,000 protein pairs.

I hope this question is clear. Please let me know if there are any doubts.

gene • 7.1k views
12.1 years ago
Fidel ★ 2.0k

Depth in the GO hierarchy is not a very effective way to assess the similarity of two gene products based on GO. Instead, you may want to look at the so-called 'similarity measures'. For example, some of these measures take into account the information content of the GO term that is the lowest common ancestor of a pair of GO terms.

I have used this method in the past to assess the reliability of protein-protein interactions.

You can compute these similarity measures online using FunSimMat or IntelliGO.

Also, you may want to look at the following publications, which explain the problem and propose solutions:

  1. Schlicker, A., Domingues, F. S., Rahnenführer, J., & Lengauer, T. (2006). A new measure for functional similarity of gene products based on Gene Ontology. BMC Bioinformatics, 7, 302.

  2. Resnik, P. (1999). Semantic Similarity in a Taxonomy: An Information-Based Measure and its Application to Problems of Ambiguity in Natural Language. Journal of Artificial Intelligence Research, 11, 95-130.

  3. Lin, D. (1998). An Information-Theoretic Definition of Similarity. Proc. of the 15th International Conference on Machine Learning.
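
To make the information-content idea concrete, here is a minimal sketch of a Resnik-style term similarity in Python. The DAG, the term names, and the annotation counts are all invented for illustration; in practice the counts would come from an annotation file for your organism. The similarity of two terms is taken as the highest information content among their common ancestors.

```python
import math

# Toy DAG (child -> parents). Term names and counts are invented, not real GO.
PARENTS = {
    "root": [],
    "A": ["root"],
    "B": ["root"],
    "C": ["A", "B"],
    "D": ["A"],
}
# Hypothetical number of gene products annotated directly to each term.
DIRECT_COUNTS = {"root": 0, "A": 2, "B": 4, "C": 1, "D": 1}


def ancestors(term):
    """Return the term itself plus all of its ancestors."""
    seen = {term}
    stack = [term]
    while stack:
        for parent in PARENTS[stack.pop()]:
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def information_content():
    """IC(t) = -log p(t), where p(t) is the share of annotations at t or below."""
    cumulative = {t: 0 for t in PARENTS}
    for term, n in DIRECT_COUNTS.items():
        for anc in ancestors(term):
            cumulative[anc] += n
    total = cumulative["root"]
    return {t: -math.log(c / total) for t, c in cumulative.items() if c > 0}


def resnik(t1, t2, ic):
    """Highest IC among the common ancestors of t1 and t2."""
    common = ancestors(t1) & ancestors(t2)
    return max(ic[t] for t in common if t in ic)


ic = information_content()
print(resnik("C", "D", ic))  # most informative common ancestor is "A"
print(resnik("B", "D", ic))  # only the root is shared, so similarity is zero
```

This scores pairs of terms only. To score a pair of gene products, the term-level scores over all annotation pairs still have to be combined, for example by taking the maximum; the publications above discuss several ways of doing that.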

12.1 years ago

The problem with your approach is that the GO hierarchy is set up so that you can take multiple paths to reach the same term. Let's say a gene is annotated with the term neuron differentiation. You can reach that term through neuron development or through cell differentiation. Depending on the path, the depth could be different.
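
A small sketch of why a single "depth" is ambiguous in a DAG. The edges below are invented to illustrate the point and do not reproduce the real GO structure; only the term names are borrowed from the example above. The function collects the length of every path from a term up to the root.

```python
# child -> parents; hypothetical edges, chosen only to give two paths
# of different length to the same term.
PARENTS = {
    "biological_process": [],
    "cell differentiation": ["biological_process"],
    "developmental process": ["biological_process"],
    "nervous system development": ["developmental process"],
    "neuron development": ["nervous system development"],
    "neuron differentiation": ["cell differentiation", "neuron development"],
}


def path_lengths(term):
    """Return the set of path lengths (in edges) from term up to the root."""
    parents = PARENTS[term]
    if not parents:
        return {0}
    return {n + 1 for p in parents for n in path_lengths(p)}


lengths = path_lengths("neuron differentiation")
print(sorted(lengths))             # one entry per distinct path length
print(min(lengths), max(lengths))  # shortest and longest path to the root
```

If you do need a depth, you have to decide up front whether you mean the shortest or the longest path and apply that choice consistently across all pairs.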

As Fidel suggested, it might be better to look at the closest common ancestor or to use one of the tools he mentioned.

You can download the GO SQL database and query it to get descendant/ancestor information.

I recently wrote a blog entry about getting descendants/ancestors using the .obo flat file, if you want to try that.
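
For reference, a rough sketch of the flat-file approach (this is not the code from the blog entry). It reads `[Term]` stanzas, follows only `is_a` links (other relationships such as `part_of` are ignored here), and computes the terms two gene products share once ancestors are included. The IDs in `EXAMPLE_OBO` and the `gene_a`/`gene_b` annotations are placeholders, not real GO data.

```python
# Tiny stand-in for an .obo file; the EX: identifiers are made up.
EXAMPLE_OBO = """\
[Term]
id: EX:0001
name: root

[Term]
id: EX:0002
name: child a
is_a: EX:0001 ! root

[Term]
id: EX:0003
name: child b
is_a: EX:0001 ! root

[Term]
id: EX:0004
name: grandchild
is_a: EX:0002 ! child a
is_a: EX:0003 ! child b

[Typedef]
id: part_of
"""


def parse_obo(lines):
    """Map each term id to the set of its direct is_a parents."""
    parents = {}
    current = None
    in_term = False
    for raw in lines:
        line = raw.strip()
        if line.startswith("["):
            # A new stanza starts; only [Term] stanzas are of interest.
            in_term = line == "[Term]"
            current = None
        elif in_term and line.startswith("id: "):
            current = line[4:]
            parents.setdefault(current, set())
        elif in_term and current and line.startswith("is_a: "):
            # Drop the trailing "! name" comment, keep the parent id.
            parents[current].add(line[6:].split("!")[0].strip())
    return parents


def ancestors(term, parents):
    """Return the term plus every term reachable through is_a links."""
    seen = {term}
    stack = [term]
    while stack:
        for parent in parents.get(stack.pop(), ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def shared_terms(terms_1, terms_2, parents):
    """Terms common to both gene products after adding all ancestors."""
    closure_1 = set().union(*(ancestors(t, parents) for t in terms_1))
    closure_2 = set().union(*(ancestors(t, parents) for t in terms_2))
    return closure_1 & closure_2


parents = parse_obo(EXAMPLE_OBO.splitlines())
gene_a = {"EX:0004"}  # hypothetical annotations
gene_b = {"EX:0002"}
print(sorted(shared_terms(gene_a, gene_b, parents)))
```

With a real file you would pass the open file handle to `parse_obo` instead of `EXAMPLE_OBO.splitlines()`, and build the per-gene term sets from an annotation file for your organism.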

12.1 years ago

Try Cytoscape with a GO plugin.

