Question: Expected Gene Ontology term frequency in a genome?
3.2 years ago, kbrevik wrote:

Hello! I hope this isn't too obvious of a question.

I'm looking for some basic ballpark estimates of GO term frequency in "average" genomes, or for benchmarks done on current assemblies. For example, GO:0000049 occurs 12 times in assembly x, GO:0013232 occurs 2000 times, etc. Of course these numbers will differ a lot depending on methods and data, but rough estimates are what I am looking for.

I'm working with some resequenced genomes, and I just want to confirm that my estimates are consistent, to rule out programmatic issues. Thanks!


This is interesting, but not available...? It's usually the other way around, i.e. you use the gene frequency estimates to judge whether a GO category is significant in an experiment. In short, you may have to write your own program to find out what you want to know. Your post made me recall this blog post, so I hope it is a good lead for you.
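As a starting point for such a program, here is a minimal sketch that counts how many distinct gene products are annotated to each GO term. It assumes the annotations are in the tab-separated GAF layout (gene ID in column 2, qualifier in column 4, GO ID in column 5, comment lines starting with `!`); the function name `count_go_terms` and the sample rows are made up for illustration and are not from any real assembly.

```python
from collections import defaultdict


def count_go_terms(gaf_lines):
    """Return {GO ID: number of distinct gene products annotated to it}.

    Assumes GAF-style tab-separated lines: column 2 is the gene product ID,
    column 4 the qualifier, column 5 the GO ID. Lines starting with '!' are
    comments. Annotations qualified with NOT are skipped.
    """
    genes_per_term = defaultdict(set)
    for line in gaf_lines:
        if not line.strip() or line.startswith("!"):
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 5:
            continue  # malformed row, ignore it
        gene_id, qualifier, go_id = cols[1], cols[3], cols[4]
        if "NOT" in qualifier.split("|"):
            continue  # negative annotation
        genes_per_term[go_id].add(gene_id)
    return {term: len(genes) for term, genes in genes_per_term.items()}


# Hypothetical sample rows (only the first few GAF columns are filled in).
sample = [
    "!gaf-version: 2.2",
    "DB\tgeneA\tgeneA\t\tGO:0000049\tREF\tIEA",
    "DB\tgeneA\tgeneA\t\tGO:0000049\tREF\tISS",  # duplicate gene/term pair
    "DB\tgeneB\tgeneB\t\tGO:0000049\tREF\tIEA",
    "DB\tgeneB\tgeneB\t\tGO:0003723\tREF\tIEA",
    "DB\tgeneC\tgeneC\tNOT\tGO:0003723\tREF\tIEA",
]

counts = count_go_terms(sample)
for term in sorted(counts):
    print(term, counts[term])
```

Counting distinct genes per term, not raw rows, keeps the numbers from being inflated when the same gene/term pair is listed with several evidence codes.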

written 3.0 years ago by theobroma221.1k

I can see potential issues with this approach:
- One is accounting for the hierarchical nature of the ontology and the way genes are annotated: the same gene in two different genomes may have been annotated with different but related terms, e.g. the parent term in one case and a child term in the other.
- The second is accounting for the use of different versions of the ontology. While obsolete terms can be mapped to new ones, new terms have obviously not been used in older annotations.
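One way to reduce the first issue is to propagate every annotation up to all of its ancestor terms before counting, so that a parent-level and a child-level annotation of the same gene agree at the parent term. The sketch below assumes a hand-written toy `PARENTS` map and hypothetical helper names (`ancestors`, `propagated_counts`); a real run would load the parent relations from the ontology file, using the same ontology release for every genome being compared.

```python
# Toy fragment of the is_a hierarchy, written by hand for illustration:
# tRNA binding -> RNA binding -> nucleic acid binding.
PARENTS = {
    "GO:0000049": ["GO:0003723"],
    "GO:0003723": ["GO:0003676"],
}


def ancestors(term, parents):
    """Return every term reachable from `term` by following parent links."""
    seen = set()
    stack = [term]
    while stack:
        current = stack.pop()
        for parent in parents.get(current, ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def propagated_counts(gene_to_terms, parents):
    """Count distinct genes per term after adding all ancestor terms."""
    genes_per_term = {}
    for gene, terms in gene_to_terms.items():
        closure = set()
        for term in terms:
            closure.add(term)
            closure |= ancestors(term, parents)
        for term in closure:
            genes_per_term.setdefault(term, set()).add(gene)
    return {term: len(genes) for term, genes in genes_per_term.items()}


# Same hypothetical gene, annotated at different depths in two genomes.
genome_1 = {"geneA": {"GO:0000049"}}  # child term
genome_2 = {"geneA": {"GO:0003723"}}  # parent term

counts_1 = propagated_counts(genome_1, PARENTS)
counts_2 = propagated_counts(genome_2, PARENTS)
print(counts_1["GO:0003723"], counts_2["GO:0003723"])
```

After propagation the two genomes agree at the parent term and above, while the child term still differs, so comparisons are only meaningful at the level both annotations reach.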

modified 3.0 years ago • written 3.0 years ago by Jean-Karim Heriche22k