Obtain COGs from the 2020 database for Vibrio cholerae
3.3 years ago
Merebellum • 0

Hey, I wish to sort the genes/proteins from my genome of interest (Vibrio cholerae) into categories. One way is by using COGs. eggNOG is very nice, but I wish to sort the entire genome.

I am working with a similar genome that is already in the 2020 database. Is there a way to obtain/download the COG list with IDs and categories for the entire genome from eggNOG or the NCBI server?

I found an example of what I would like to end up with, but it is not my genome. It contains the COG ID, categories and the genes. If I could find the same for Vibrio cholerae, that would be incredible. https://ftp.ncbi.nih.gov/pub/wolf/COGs/COG0303/listcogs.txt

ex. 61 ||||||||--|||-|-|-|||||||||---|||||||||-||-|||||||||||||--||------ 48 H HemL COG0001 Glutamate-1-semialdehyde aminotransferase

There is another link that has sorted 678 proteins of the V. cholerae genome, but I need the entire genome. I manually checked some of the genes that weren't sorted into clusters by looking at their UniProt and eggNOG entries, and they do have COGs. https://www.research.cs.rutgers.edu/~seabee/cog/Vch.html

ex. VCA0906 [NT] COG0840 (578) Methyl-accepting chemotaxis protein

There should be a way to get the entire genome, as it is already sorted. I looked through the NCBI FTP site and was able to pull out the V. cholerae (Vch) COG IDs along with the gene names. However, I'm not sure how to get the corresponding categories. https://ftp.ncbi.nih.gov/pub/COG/COG2020/data/

ex.

VC0626 COG0001

VC2644 COG0002

VC0067 COG0006

How can I get the COG category and ID for the entire genome?
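One way to attach categories to a gene/COG list like the one above is to join it against a COG definition table. The following is only a minimal sketch: it assumes a tab-separated definition table with the COG ID in the first column and the category letters in the second. Whether a file in the COG2020 data directory has exactly that layout is an assumption to check against the README there. The function names and the inline sample rows are hypothetical; the sample values are taken from the examples in this post.

```python
import csv
import io


def load_cog_categories(def_lines):
    """Map COG ID -> category letters from tab-separated definition lines.

    Assumed layout (verify against the real file): column 1 is the COG ID,
    column 2 is the functional category string, e.g. "H" or "NT".
    """
    categories = {}
    reader = csv.reader(def_lines, delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        if len(row) >= 2:
            categories[row[0]] = row[1]
    return categories


def annotate(gene_cog_pairs, categories):
    """Attach a category to each (gene, COG) pair; '?' marks unknown COGs."""
    return [(gene, cog, categories.get(cog, "?")) for gene, cog in gene_cog_pairs]


# Illustrative stand-in for the definition table (values from this post).
sample_defs = io.StringIO(
    "COG0001\tH\tGlutamate-1-semialdehyde aminotransferase\n"
    "COG0840\tNT\tMethyl-accepting chemotaxis protein\n"
)

# Gene/COG pairs in the form already extracted from the FTP data.
pairs = [("VC0626", "COG0001"), ("VCA0906", "COG0840"), ("VC2644", "COG0002")]

annotated = annotate(pairs, load_cog_categories(sample_defs))
for row in annotated:
    print("\t".join(row))
```

For a real run, the `io.StringIO` object would be replaced by an open file handle on the downloaded definition table. COGs missing from the table come out as `?` rather than being dropped, so gaps stay visible.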

COG sequence • 957 views

It appears that the COG database was updated in 2020. Unfortunately, I can't get the search at NCBI to work. You may want to write to the web contact and let them know.


Hey GenoMax!

Thank you for the reply. I just sent them an email asking about the database. Do you know of an alternative? I would love to be able to go through the entire process of BLASTing my genome against the database and retrieving the COG output for my genome, but this is beyond my skill and I cannot figure it out. Ultimately, I just need a pie chart of the gene/protein functions for the genome and for my list of 300 candidate genes. That's why the genome in the 2020 database will do just fine.
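For the pie chart itself, the input is just a count of genes per category letter. A minimal sketch, assuming the genes have already been annotated as (gene, COG ID, category) rows; the function name and the two sample rows are hypothetical, with values taken from the examples in the question:

```python
from collections import Counter


def category_counts(annotated_rows):
    """Count genes per single-letter COG category.

    A gene whose COG has several letters (e.g. "NT") is counted once under
    each letter, so the total can exceed the number of genes.
    """
    counts = Counter()
    for _gene, _cog, letters in annotated_rows:
        for letter in letters:
            counts[letter] += 1
    return counts


# Illustrative rows only; a real run would use the whole genome or the
# candidate gene list.
rows = [("VC0626", "COG0001", "H"), ("VCA0906", "COG0840", "NT")]

counts = category_counts(rows)
total = sum(counts.values())
for letter, n in sorted(counts.items()):
    print(f"{letter}\t{n}\t{n / total:.1%}")
```

Running the same function on the full genome and on the candidate list gives two tables that can be plotted side by side. Splitting multi-letter categories is a design choice; keeping "NT" as its own slice is equally defensible.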
