Hi there, I am new to the field of metagenomics. I recently ran blastx on my contigs against the nr database, and I plan to categorize the protein hits by COG ID and COG function. To do this, I am using the cog2003-2014.csv file available from the NCBI website: I take the protein IDs from the blastx results and look up the associated COG ID. However, almost none of the protein IDs in my BLAST results are found in cog2003-2014.csv. Am I doing something wrong here? I am aware that the COG database is much smaller than nr, but is there another way to find the COG IDs for my proteins? I have also tried UniProt's Retrieve/ID mapping tool; however, it gives me GO terms rather than COG IDs. Is it possible to use GO terms to get to COG IDs? Thanks a lot!
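
For reference, here is a minimal sketch of the lookup described above. It is not a definitive implementation. It assumes the blastx output is tab-separated with the subject (hit) ID in the second column, and that cog2003-2014.csv is comma-separated with the protein ID in the third column and the COG ID in the seventh; those column positions are assumptions and should be checked against the actual files. The function names `load_cog_map` and `assign_cogs` are hypothetical.

```python
import csv


def load_cog_map(handle, protein_col=2, cog_col=6):
    """Build {protein_id: set of COG IDs} from a COG CSV file handle.

    The column indices are assumptions about the file layout, not confirmed.
    """
    mapping = {}
    for row in csv.reader(handle):
        # Skip blank or short rows that lack the expected columns.
        if len(row) <= max(protein_col, cog_col):
            continue
        protein_id = row[protein_col].strip()
        cog_id = row[cog_col].strip()
        # A protein can carry more than one COG domain, so keep a set.
        mapping.setdefault(protein_id, set()).add(cog_id)
    return mapping


def assign_cogs(blast_handle, cog_map, subject_col=1):
    """Split BLAST subject IDs into those with and without a COG entry."""
    found, missing = {}, set()
    for line in blast_handle:
        # Ignore blank lines and comment lines in the tabular output.
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        subject = fields[subject_col]
        if subject in cog_map:
            found[subject] = cog_map[subject]
        else:
            missing.add(subject)
    return found, missing
```

Printing a few entries from `missing` next to a few keys of the COG map shows whether the two files use the same kind of identifier at all, which would be the first thing to rule out when nearly every lookup fails.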

