InterProScan GO terms results explanation
6.7 years ago
konstantinkul ▴ 110

Hi all,

I am new to InterProScan and to genome annotation. I need to classify genes by GO terms to produce a general table of "GO Terms Classification Count Results". Could you please explain one thing? When I run InterProScan with the '-goterms' option, the output contains several GO terms for a single entry (see the line below). Should I parse this string and keep only one term, or should I keep all of them for further analysis (for instance with https://www.animalgenome.org/bioinfo/tools/catego/)? And what exactly does it mean when several GO terms appear?

H31_00826   08305e1960d087dfd44f6179ca32d6b9    88  SMART   SM01387     7   87  1.3E-47 T   21-08-2017  IPR000589   Ribosomal protein S15   GO:0003735|GO:0005622|GO:0005840|GO:0006412

Thank you in advance!

genome gene • 3.8k views
6.7 years ago

A given protein may be annotated with multiple terms, from the same GO domain or from different ones (biological process, molecular function, cellular component). In your example, the protein is annotated as being involved in translation (GO:0006412, biological process domain) and as being part of the ribosome (GO:0005840, cellular component domain). So if you want to understand your protein, you need to parse all of these terms rather than pick one. Sometimes a protein is annotated with both a term and one or more of its children (the ontology is a directed acyclic graph), in which case you may want to keep only the most specific term or several of them, depending on the level of specificity you need.
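As a rough illustration of keeping all terms, here is a minimal Python sketch that collects every GO term per protein and then counts how many proteins carry each term. It assumes the InterProScan output is the tab-separated format with the protein ID in column 1 and the `|`-separated GO terms in column 14, as in the line from the question. The function names `collect_go_terms` and `count_proteins_per_term` are made up for this example, and the handling of a trailing `(source)` suffix is an assumption about newer output versions, not something shown in the question.

```python
from collections import Counter, defaultdict


def collect_go_terms(lines):
    """Map each protein ID to the set of all GO terms seen in its rows.

    Assumes tab-separated InterProScan output with GO terms in column 14.
    """
    protein_to_go = defaultdict(set)
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 14:
            continue  # row has no GO column
        go_field = fields[13].strip()
        if go_field in ("", "-"):
            continue  # no GO terms assigned for this match
        for term in go_field.split("|"):
            # Assumption: drop a trailing "(source)" suffix if one is present.
            protein_to_go[fields[0]].add(term.split("(")[0])
    return protein_to_go


def count_proteins_per_term(protein_to_go):
    """Count how many proteins are annotated with each GO term."""
    counts = Counter()
    for terms in protein_to_go.values():
        counts.update(terms)
    return counts


if __name__ == "__main__":
    # The example row from the question, rebuilt with explicit tabs
    # (the signature description in column 6 is empty).
    row = "\t".join([
        "H31_00826", "08305e1960d087dfd44f6179ca32d6b9", "88", "SMART",
        "SM01387", "", "7", "87", "1.3E-47", "T", "21-08-2017",
        "IPR000589", "Ribosomal protein S15",
        "GO:0003735|GO:0005622|GO:0005840|GO:0006412",
    ])
    terms = collect_go_terms([row])
    print(sorted(terms["H31_00826"]))
    print(count_proteins_per_term(terms))
```

Using a set per protein means that a term reported by several signatures (for example SMART and Pfam hits on the same protein) is only counted once for that protein. Reducing parent/child redundancy would additionally require the ontology itself, which this sketch does not load.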


Ok, I got it! Thanks a lot!

If this answer was helpful, you should upvote it; if it resolved your question, you should mark it as accepted.
