Question: Is It Wise To Transfer KEGG Annotations Between Orthologs?
6.8 years ago by
United Kingdom
We have a set of bacterial genomes for which we know the pangenome and the KEGG annotation (the KO codes) for each protein. If two proteins are orthologs, is it correct to transfer the KO code from the annotated protein to the one that KEGG did not annotate? According to one of the many definitions of orthologs ("two orthologs share the same function"), it should be correct.


6.8 years ago by
This is a valid approach. It is already used to propagate annotations for genes in GO.

Thanks. What bothers me is that one of the orthologs was not annotated by KEGG, which could mean that it does not have the function of the other ortholog. But I guess we can add an evidence flag, as GO and UniProt do...

You definitely should do that. In addition, you may want to read some of the papers published over the last two years or so by Marc Robinson-Rechavi's group and Matthew Hahn's group on the "ortholog conjecture." Hahn's group showed some potential problems, but Robinson-Rechavi and others ran additional analyses and showed that the conjecture still holds (orthologs are more functionally similar than paralogs), though there are still caveats to keep in mind. That work was mostly limited to eukaryotes, in particular Metazoa, where tissue specificity and similar factors can be an issue, but it is worth familiarizing yourself with.

Thanks, I'll definitely read the papers you are suggesting and keep track of the "transferred" annotations.
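For anyone wanting to keep track of transferred annotations, here is a minimal Python sketch of one way it could be done. Everything in it is an assumption for illustration: the function name `transfer_ko`, the input layout (ortholog clusters as a dict of protein lists, KO codes as a dict), the evidence labels "KEGG" and "ORTHOLOG_TRANSFER", and the protein IDs and KO codes are placeholders, not a KEGG or GO convention. It only transfers a KO when all annotated members of a cluster agree, and leaves conflicting clusters alone.

```python
def transfer_ko(clusters, ko_by_protein):
    """Propagate KO codes inside ortholog clusters and record the evidence.

    clusters:       {cluster_id: [protein_id, ...]}  (hypothetical pangenome output)
    ko_by_protein:  {protein_id: ko_code}            (direct KEGG annotations)
    Returns {protein_id: (ko_code, evidence, source_protein)}.
    """
    result = {}
    for members in clusters.values():
        # Proteins in this cluster that KEGG annotated directly.
        annotated = {p: ko_by_protein[p] for p in members if p in ko_by_protein}
        for protein, ko in annotated.items():
            result[protein] = (ko, "KEGG", None)

        # Transfer only when the annotated members agree on a single KO.
        distinct_kos = set(annotated.values())
        if len(distinct_kos) != 1:
            continue
        ko = next(iter(distinct_kos))
        source = sorted(annotated)[0]  # one annotated member, kept for traceability
        for protein in members:
            if protein not in annotated:
                result[protein] = (ko, "ORTHOLOG_TRANSFER", source)
    return result


# Toy input: og1 is consistent, og2 has no annotation, og3 is conflicting.
clusters = {
    "og1": ["a1", "b1", "c1"],
    "og2": ["a2", "b2"],
    "og3": ["a3", "b3", "c3"],
}
ko_by_protein = {"a1": "K00001", "b1": "K00001", "a3": "K00002", "b3": "K00003"}
annotations = transfer_ko(clusters, ko_by_protein)
```

In this toy run only "c1" receives a transferred KO; "c3" is left unannotated because its orthologs disagree, which is the kind of case worth inspecting by hand.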

6.8 years ago by
United States
Yes. That is why we use ID-mapping tools to find the IDs of orthologs. Unfortunately, there are some caveats:

  1. KEGG and MetaCyc are NOT always complete.
  2. Missing hits may NOT indicate a missing protein or function; they may instead reflect a poorly annotated (not updated) database.
  3. The curation level of the databases is sometimes questionable as well.
  4. Genomes, proteomes, or transcriptomes from users (like you or me) may show "gaps" or "missing links" in pathways or functionality, and these are traceable. Suppose a pathway runs from A through B, C, D, and E, but in your organism A appears to go directly to the D and E steps. That indicates missing proteins or functions, unless those steps are biochemically redundant or "spontaneous/biogenic" in a biological system.
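The A-to-E example in point 4 can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the pathway is treated as a single linear chain of steps, the step labels are the placeholder letters from the example, and the helper name `find_gaps` is hypothetical. Real KEGG pathways branch and have alternative enzymes, so a reported gap is a candidate to check, not proof of a missing function.

```python
def find_gaps(pathway_steps, present):
    """Return steps absent from the organism but flanked by present steps.

    pathway_steps: ordered list of step labels for a linear pathway (assumption)
    present:       set of step labels detected in the organism
    """
    found = [i for i, step in enumerate(pathway_steps) if step in present]
    if len(found) < 2:
        return []  # no flanking steps, so nothing can be called a gap
    first, last = found[0], found[-1]
    return [step for step in pathway_steps[first:last + 1] if step not in present]


pathway = ["A", "B", "C", "D", "E"]
present = {"A", "D", "E"}
gaps = find_gaps(pathway, present)  # steps between A and E that were not found
```

Here the gaps are B and C, the two steps that either are truly absent, are redundant or spontaneous, or were missed because of incomplete annotation.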


