8.4 years ago by
Unfortunately, there is very little help or documentation available for the COG database, so we are reduced to educated guesswork.
Taking the top row, COG category J, I'd guess that:
- % in genus = percentage of proteins from the genus Acaryochloris that are assigned to category J
- % in Cyanobacteria = percentage of proteins from the phylum Cyanobacteria that are assigned to category J
- % in Bacteria = percentage of proteins from the domain Bacteria that are assigned to category J
The first two columns are less obvious. I'd guess that "% in sequence" is based on a sum of sequence lengths (possibly coding sequence only) and that "% in genome" is the percentage of proteins from that genome assigned to the category, but neither is clear at all.
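To make the two guessed interpretations concrete, here is a minimal Python sketch. It is not how COG actually computes these columns (that is undocumented, as far as I can tell); the toy records and the function names `percent_by_count` and `percent_by_length` are made up purely for illustration.

```python
from collections import defaultdict

# Hypothetical toy data: (protein ID, COG category, protein length).
# None of these values come from the COG database.
proteins = [
    ("p1", "J", 300),
    ("p2", "J", 100),
    ("p3", "K", 200),
    ("p4", "L", 400),
]

def percent_by_count(records):
    """Percentage of proteins assigned to each category (the '% in genome' guess)."""
    counts = defaultdict(int)
    for _, category, _ in records:
        counts[category] += 1
    total = len(records)
    return {c: 100.0 * n / total for c, n in counts.items()}

def percent_by_length(records):
    """Percentage of total sequence length per category (the '% in sequence' guess)."""
    lengths = defaultdict(int)
    for _, category, length in records:
        lengths[category] += length
    total = sum(lengths.values())
    return {c: 100.0 * n / total for c, n in lengths.items()}

print(percent_by_count(proteins))
print(percent_by_length(proteins))
```

With this toy data, category J holds 2 of 4 proteins but only 400 of 1000 residues, so the two definitions give different percentages. If the two columns in your table differ in a similar way, that would be consistent with (though not proof of) the guess above.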
Having said all that: I would not use COG. It is a very old database and is no longer maintained by the NCBI. You can get similar information from KEGG or IMG (Integrated Microbial Genomes).