Question

KEGG ID vs COG ID, and the best method for large-batch ID assignment?

0

10.9 years ago

JacobS ▴ 980

I have a large RNA-Seq dataset with reads about 200 bp long. I've already used other tools to annotate these reads with GI numbers, and now I want to associate them with KEGG/COG IDs for pathway analysis. Can someone please help me understand the difference between KEGG and COG IDs (and their different uses), and what the best way is to do large-batch annotation (on the scale of millions)?

Thanks!

kegg annotation pathway • 10k views

updated 2.8 years ago by Arsenal ▴ 160 • written 10.9 years ago by JacobS ▴ 980

Ram · Answer 1 · 2013-05-29

4

10.9 years ago

Neilfws 49k

Not a complete answer to your questions, but with regard to understanding the differences:

COG was an NCBI project to classify proteins from sequenced genomes. It is no longer maintained and you should probably not use it. If you need to know more:

KEGG is an altogether larger, actively maintained project. You might think of it as an attempt to create a systems biology database. It uses its own annotation and clustering pipeline, the KEGG Orthology (KO) system, to assign IDs. Here's a key KEGG publication.

Many software tools have been built around KEGG, for example in R/Bioconductor.
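For the batch side of the question, here is a minimal Python sketch rather than a recommended pipeline. It assumes you already have a KO ID for each read (getting from GI numbers to KO IDs is a separate step not shown here) and a two-column, tab-delimited KO-to-pathway table, which is the layout returned by KEGG's REST `link` operation (for example `https://rest.kegg.jp/link/pathway/ko`). The function names, read IDs, and sample rows are hypothetical and only there for illustration.

```python
from collections import defaultdict


def parse_link_table(text):
    """Parse 'source<TAB>target' lines into {source: sorted list of targets}."""
    table = defaultdict(set)
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        source, target = line.split("\t")[:2]
        table[source].add(target)
    return {key: sorted(values) for key, values in table.items()}


def annotate_reads(read_to_ko, ko_to_pathways):
    """Yield (read_id, ko_id, pathways); a KO missing from the table gets []."""
    for read_id, ko_id in read_to_ko:
        yield read_id, ko_id, ko_to_pathways.get(ko_id, [])


# Illustrative rows in the two-column layout; not checked against a KEGG release.
SAMPLE_LINKS = (
    "ko:K00001\tpath:map00010\n"
    "ko:K00001\tpath:map00071\n"
    "ko:K00002\tpath:map00010\n"
)

# Hypothetical read-to-KO assignments produced by an upstream annotation step.
SAMPLE_READS = [
    ("read_1", "ko:K00001"),
    ("read_2", "ko:K00002"),
    ("read_3", "ko:K99999"),  # deliberately absent from the table
]

ko_to_pathways = parse_link_table(SAMPLE_LINKS)
annotations = list(annotate_reads(SAMPLE_READS, ko_to_pathways))

for read_id, ko_id, pathways in annotations:
    print(read_id, ko_id, ",".join(pathways) or "-")
```

Because the lookup is a plain dictionary and `annotate_reads` is a generator, the read-to-KO pairs can be streamed from a file instead of held in a list, which is probably what you would want at the scale of millions of reads.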

10.9 years ago by Neilfws 49k

0

Great, very helpful post!

10.9 years ago by JacobS ▴ 980

0

@Neilfws, I see this post is two years old. Could you update it with a newer perspective? While comparing KEGG and COG, I found this publication: PMID 25428365. So now, is it better to use KEGG or COG? Thanks.

updated 4.5 years ago by Ram 43k • written 8.8 years ago by swapnil.doijad • 0

0

Just to add an update for anyone who happens to find this topic: there was a COG update in 2021:

Galperin, Michael Y., Yuri I. Wolf, Kira S. Makarova, Roberto Vera Alvarez, David Landsman, and Eugene V. Koonin. "COG database update: focus on microbial diversity, model organisms, and widespread pathogens." Nucleic Acids Research 49, no. D1 (2021): D274-D281.

2.8 years ago by Arsenal ▴ 160