Hiding/Merging Child Annotation Terms Under Their Parents [Gene Ontology]
12.2 years ago

Hi, I have a list of GO IDs and their respective over-represented annotations, but most of them are children or sub-divisions of a main/parent term. How can I hide them, or maybe statistically merge them under the main/parent category?

Example Set:

GO    Term
GO:0006351    transcription, DNA-dependent
GO:0032774    RNA biosynthetic process
GO:0016070    RNA metabolic process
GO:0019222    regulation of metabolic process
GO:0050794    regulation of cellular process
GO:0050789    regulation of biological process
GO:0065007    biological regulation
GO:0048522    positive regulation of cellular process
GO:0031323    regulation of cellular metabolic process
GO:0090304    nucleic acid metabolic process
GO:0080090    regulation of primary metabolic process
GO:0060255    regulation of macromolecule metabolic process
GO:0006139    nucleobase-containing compound metabolic process
GO:0048518    positive regulation of biological process

So, terms like positive regulation of cellular process and positive regulation of biological process can go under broader terms like regulation of cellular process and regulation of biological process.

Can anyone suggest a tool that can do this textually or graphically?

Cheers

P.S. REVIGO can do it, but I am looking for something that can be run from the terminal or R.

chip-seq go gene-ontology • 6.4k views

I would like to know why you want to do that. In general I think the opposite approach is more useful. In that case you would calculate the significant child terms first, remove (prune) them from the tree and then calculate whether the parent term is still significant. We actually have a paper on that, see: http://dx.doi.org/10.1093/bioinformatics/bts366 . Merging everything in the parent terms often leads to conclusions like: "we did a diet study and found that 'metabolism' was affected". Sigh...
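The pruning idea described above can be sketched roughly as follows. This is a simplified illustration, not the method from the linked paper: it assumes a one-sided hypergeometric test, a single parent/child pair, and toy gene sets. All function names and gene identifiers are hypothetical.

```python
from math import comb

def hypergeom_sf(k, M, n, N):
    """P(X >= k) when N genes are drawn from M, of which n carry the term."""
    total = comb(M, N)
    tail = sum(comb(n, i) * comb(M - n, N - i) for i in range(k, min(n, N) + 1))
    return tail / total

def enrichment_p(term_genes, hits, universe):
    """One-sided over-representation p-value for one term."""
    term = term_genes & universe
    hits_u = hits & universe
    k = len(term & hits_u)
    return hypergeom_sf(k, len(universe), len(term), len(hits_u))

def parent_after_pruning(parent_genes, child_genes, hits, universe, alpha=0.05):
    """Test the child first; if it is significant, drop its genes from the
    parent's annotation and test the parent again.
    Returns (child_p, parent_p_before, parent_p_after)."""
    child_p = enrichment_p(child_genes, hits, universe)
    before = enrichment_p(parent_genes, hits, universe)
    if child_p < alpha:
        after = enrichment_p(parent_genes - child_genes, hits, universe)
    else:
        after = before
    return child_p, before, after

# Toy data (made up): the parent's signal comes entirely from the child.
universe = {f"g{i}" for i in range(100)}
child_genes = {f"g{i}" for i in range(10)}
parent_genes = {f"g{i}" for i in range(30)}
hits = {f"g{i}" for i in range(8)} | {"g50", "g51"}

child_p, before, after = parent_after_pruning(parent_genes, child_genes, hits, universe)
print(child_p, before, after)
```

In this toy case the parent looks enriched only because it inherits the child's genes, so it is no longer significant once those genes are removed.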


Chris, I will read your paper; it looks promising. I take your point. I am practicing gene ontology analysis and had the notion that only parent terms are important and that we are mostly not interested in the children. Just think of a case like "Regulation of Biological processes" followed by "Positive Regulation of Biological processes" and "Negative Regulation of Biological processes": in that case one would like to see just the parent, wouldn't one?

Thanks

Cheers


The problem with doing that is: how far up the ancestor tree do you go? REVIGO has a nice implementation to get semantic relevance out of the GO structure. I was planning to try to replicate their algorithm in Python, but I just can't find the time. Another alternative is to use GO slim annotations instead, but I often find those to be too vague.


Hey, I assume one should go up to the main parent term in the tree and then jump to the next tree. I now think the best way is to cherry-pick the terms one wants to see from the basket of highly significant terms and represent them either visually or textually.

12.2 years ago
Joachim ★ 2.9k

You can follow the really detailed instructions given in this blog post by Damian Kao (Dk over here). He explains how to read the Gene Ontology's OBO flat-file into Python data structures and then access parent/child relationships.
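As a rough idea of what that approach looks like (this is not the code from the blog post), here is a minimal sketch that reads `is_a` lines from OBO-formatted text and groups a list of enriched terms under the most general enriched ancestor. The function names are hypothetical, the OBO excerpt is hand-written for illustration and should be checked against a real go.obo file, and other relations such as part_of are ignored.

```python
from collections import defaultdict

def parse_obo_is_a(lines):
    """Map each term ID to the set of its direct is_a parents."""
    parents = defaultdict(set)
    current, in_term = None, False
    for raw in lines:
        line = raw.strip()
        if line.startswith("["):
            # A new stanza starts; only [Term] stanzas are of interest.
            in_term = (line == "[Term]")
            current = None
        elif in_term and line.startswith("id: "):
            current = line[4:].strip()
            parents[current]  # register the term even if it has no parents
        elif in_term and current and line.startswith("is_a: "):
            # Drop the trailing "! term name" comment.
            parents[current].add(line[6:].split("!")[0].strip())
    return dict(parents)

def ancestors(term, parents):
    """All terms reachable from `term` by following is_a links upward."""
    seen, stack = set(), [term]
    while stack:
        for p in parents.get(stack.pop(), ()):
            if p not in seen:
                seen.add(p)
                stack.append(p)
    return seen

def collapse_to_parents(enriched, parents):
    """Group enriched terms under enriched terms that have no enriched ancestor."""
    enriched = set(enriched)
    anc = {t: ancestors(t, parents) & enriched for t in enriched}
    tops = {t for t, a in anc.items() if not a}
    groups = {t: [] for t in sorted(tops)}
    for t in sorted(enriched - tops):
        for top in sorted(anc[t] & tops):
            groups[top].append(t)
    return groups

# Hand-written excerpt in OBO style (illustration only).
obo_text = """\
[Term]
id: GO:0065007
name: biological regulation

[Term]
id: GO:0050789
name: regulation of biological process
is_a: GO:0065007 ! biological regulation

[Term]
id: GO:0050794
name: regulation of cellular process
is_a: GO:0050789 ! regulation of biological process

[Term]
id: GO:0048518
name: positive regulation of biological process
is_a: GO:0050789 ! regulation of biological process

[Term]
id: GO:0048522
name: positive regulation of cellular process
is_a: GO:0048518 ! positive regulation of biological process
is_a: GO:0050794 ! regulation of cellular process
"""

parents = parse_obo_is_a(obo_text.splitlines())
enriched = ["GO:0065007", "GO:0050789", "GO:0050794", "GO:0048518", "GO:0048522"]
groups = collapse_to_parents(enriched, parents)
for top, hidden in groups.items():
    print(top, "hides", hidden)
```

For a real analysis, the same functions could be fed the lines of a downloaded OBO file instead of the inline excerpt. Note that collapsing to the topmost enriched ancestor tends to leave only very broad terms, which is the concern raised in the comments above.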

Hope that helps.


This sounds like the way to go.


Thanks Joachim. This means I either have to download the full OBO flat file, collapse it by child terms, and then run a GO analysis on it, or find a way to convert my current file with the list of over-represented terms to the OBO flat-file format and push it through the adapted Python code.

12.2 years ago

I don't know of a command line tool or an R package that can do this.

When I've had to map child terms upward into their parents in an ontology -- which I think is what you are asking to do (I don't know what you mean by "statistically merge" terms) -- I've had some success with the Ontology Lookup Service. If you enter your term name and then hit "browse", it takes you to a page that shows the child term within its parents.

AmiGO gives you something similar too if you click "view in tree".

You can then add a column to your data table called "parent term" and then your merger is done.

I don't know what format you require your data to end up in however.


Thanks Alex. How should one proceed with a file containing a list of GO IDs? Entering single terms manually is impractical. By "statistically merging" I meant the following: suppose a child has a high log p-value and its parent has a slightly lower one. If we keep only the parent, then because the child had a high logP, the parent's logP should be increased a bit, and how much has to be determined statistically. The format can be anything from a simple text file to a plot.
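For the batch case, a small script can add the "parent term" grouping to a whole table at once. The sketch below assumes a tab-separated table with hypothetical column names `GO` and `p_value`, a precomputed child-to-parent mapping (for example, built from the ontology file), and made-up p-values. Reporting the best p-value in each group is only a display convention here, not a statistical merge of the kind described above.

```python
import csv
import io
import math
from collections import defaultdict

def summarise_by_parent(tsv_text, child_to_parent):
    """Group rows by parent term and report the strongest member per group."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    groups = defaultdict(list)
    for row in reader:
        go_id = row["GO"]
        # Terms without a listed parent stand for themselves.
        parent = child_to_parent.get(go_id, go_id)
        groups[parent].append((go_id, float(row["p_value"])))
    summary = []
    for parent, members in groups.items():
        best_id, best_p = min(members, key=lambda m: m[1])
        summary.append({
            "parent": parent,
            "n_terms": len(members),
            "best_term": best_id,
            "best_neg_log10_p": -math.log10(best_p),
        })
    # Strongest groups first.
    return sorted(summary, key=lambda r: -r["best_neg_log10_p"])

# Made-up p-values, for illustration only.
table = (
    "GO\tTerm\tp_value\n"
    "GO:0050789\tregulation of biological process\t1e-4\n"
    "GO:0048518\tpositive regulation of biological process\t1e-6\n"
    "GO:0016070\tRNA metabolic process\t1e-3\n"
)
child_to_parent = {"GO:0048518": "GO:0050789"}  # hypothetical mapping

summary = summarise_by_parent(table, child_to_parent)
for row in summary:
    print(row)
```

Reading from a real file would only require replacing the inline string with the file's contents.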


I thought you could export an XML file from OLS searches -- but it appears not. What they do have is an SQL dump, which you can search to map child/parent terms on your local machine. http://www.ebi.ac.uk/ontology-lookup/databaseExport.do

I agree -- No way would I want to do a large number of single terms manually. Thanks for clarifying statistical merging for me.
