Extracting the root annotation for each GO ID
8.0 years ago
trisha ▴ 10

I have a list of Gene Ontology IDs (such as GO:0016021 and GO:0005515), and I would like to group them by their root annotation: Biological Process, Cellular Component or Molecular Function.
The desired result is:
GO:0016021 Cellular Component (or CC)
GO:0005515 Molecular Function (or MF)
Is there an easy way to do this?

Gene Ontology • 2.9k views
8.0 years ago

The Gene Ontology OBO file contains this information in the "namespace" field.

8.0 years ago
ivivek_ngs ★ 5.2k

I think this might be what you are looking for. It seems that this mapper can take GO terms and map them to their corresponding categories based on the GO slim categories. You can also take a look at this or this (restricted to yeast, though).

Added: I would like to point to another thread with an answer from Pierre; he always amazes me with his work. Take a look here at how to use a bash script to do the job.

8.0 years ago
trisha ▴ 10

Would you please guide me on how to extract it from there? Just some hints... many thanks


The OBO file is just a text file. Look at its structure to get an idea of how to parse it. Basically, you locate the [Term] stanza for your GO term and read the namespace field associated with it. The alternative would be to query the GO MySQL database, which you can also download and set up on a local MySQL instance.
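As a rough illustration of the parsing approach described above, here is a minimal Python sketch. It assumes the usual OBO layout, where each `[Term]` stanza carries an `id:` line, a `namespace:` line and optionally `alt_id:` lines. The function names (`parse_obo_namespaces`, `group_by_root`), the BP/CC/MF label table and the file name in the comment are illustrative choices, not part of any library, and the inline sample is a hand-written excerpt rather than a copy of the real file.

```python
from collections import defaultdict

# Short labels for the three GO root namespaces (illustrative mapping).
ROOT_LABELS = {
    "biological_process": "BP",
    "cellular_component": "CC",
    "molecular_function": "MF",
}


def parse_obo_namespaces(lines):
    """Map each GO ID (and alt_id) found in [Term] stanzas to its namespace."""
    namespaces = {}
    ids, ns = [], None   # IDs and namespace seen so far in the current stanza
    in_term = False
    for raw in lines:
        line = raw.strip()
        if line.startswith("["):
            # A new stanza starts; only [Term] stanzas are of interest.
            in_term = line == "[Term]"
            ids, ns = [], None
        elif in_term and line.startswith(("id: ", "alt_id: ")):
            go_id = line.split(": ", 1)[1]
            ids.append(go_id)
            if ns is not None:          # namespace line came before this ID
                namespaces[go_id] = ns
        elif in_term and line.startswith("namespace: "):
            ns = line.split(": ", 1)[1]
            for go_id in ids:           # IDs that came before the namespace line
                namespaces[go_id] = ns
    return namespaces


def group_by_root(go_ids, namespaces):
    """Group GO IDs under BP/CC/MF; IDs not found go under 'unknown'."""
    groups = defaultdict(list)
    for go_id in go_ids:
        label = ROOT_LABELS.get(namespaces.get(go_id), "unknown")
        groups[label].append(go_id)
    return dict(groups)


# Hand-written sample in OBO style, standing in for the real file.
SAMPLE_OBO = """\
[Term]
id: GO:0005515
name: protein binding
namespace: molecular_function

[Term]
id: GO:0016021
name: integral component of membrane
namespace: cellular_component
"""

namespaces = parse_obo_namespaces(SAMPLE_OBO.splitlines())
groups = group_by_root(["GO:0016021", "GO:0005515"], namespaces)
print(groups)

# With a downloaded file (the file name is an assumption):
# with open("go.obo") as handle:
#     namespaces = parse_obo_namespaces(handle)
```

Reading `alt_id:` lines as well means that an ID which has been merged into another term can still be resolved, as long as it is still listed in the file you downloaded.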


Thanks... I used the MySQL database and it did the job.

8.0 years ago
LLTommy ★ 1.2k

Quite frankly, I think there is an easier way. This link takes you to the Gene Ontology in EBI's Ontology Lookup Service (OLS). You can search for term IDs or browse the tree view manually.


Yes, but since the OP has lots of terms, doing it manually is not a good choice here. Instead, as I noted in my edited answer, the thread that @Pierre Lindenbaum wrote using QuickGO from EBI should also do the trick.


OK, you never said you have lots of terms. :) Anyway, the OLS also offers an API, so you can get the information programmatically as well. However, if you have already solved your problem, I guess you are fine.

8.0 years ago
Guangchuang Yu ★ 2.6k

You can use clusterProfiler:

> require(clusterProfiler)
Loading required package: clusterProfiler
Loading required package: DOSE
Loading required package: DBI

> go=c('GO:0016021', 'GO:0005515')
> go2term(go)
       go_id                           Term
1 GO:0005515                protein binding
2 GO:0016021 integral component of membrane
> go2ont(go)
       go_id Ontology
1 GO:0005515       MF
2 GO:0016021       CC
> go2ont(go) -> x
> with(x, split(go_id, Ontology))
$CC
[1] "GO:0016021"

$MF
[1] "GO:0005515"