How To Annotate A Set Of Prokaryotic Genomes With eggNOG Orthologous Groups?
10.4 years ago
Hello, I'm working on a metagenomic analysis for which I need a reference set of ~1500 prokaryotic genomes (in FASTA format) from MetaHit and GenBank. My aim is to annotate the genes in these genomes with eggNOG orthologous groups.

On the eggNOG download site, the dataset "Bacteria non-supervised orthologous groups (bactNOG) and their proteins" (flat file) seems suitable for my annotation.

The dataset looks like this:

## Protein name    start_position    end_position    orthologous_group    orthologous_group description
9685.ENSFCAP00000011039    1    363    meNOG04000    Leucine-Rich repeat protein SHOC-2
7159.AAEL014718-PA    5    527    meNOG04000    Leucine-Rich repeat protein SHOC-2
9606.ENSP00000352411    1    582    meNOG04000    Leucine-Rich repeat protein SHOC-2
7719.ENSCINP00000009123    9    531    meNOG04000    Leucine-Rich repeat protein SHOC-2

My question is: how can I perform this annotation, i.e. how can I annotate my set of prokaryotic genomes with this information? Or do I also need to take into account the file "Protein sequences of all species, with the eggnog protein name."?

Thank you!

metagenomics orthologues
10.4 years ago
I imagine that this data already exists somewhere for the MetaHit sequences, as the Bork group, who did the bioinformatic analysis of the MetaHit sequences, used eggNOG in their pipeline. See "SmashCommunity: a metagenomic annotation and analysis tool" and their website. You could use the same approach on the GenBank sequences.

I suggest you contact Peer's group if you want to know where to access the MetaHit annotations and/or if you need help with using their pipeline.
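
If you end up doing the mapping yourself, here is a minimal Python sketch of the lookup step only. It assumes you have already searched your predicted genes against the eggNOG protein sequence file with a homology search tool of your choice and kept the best hit per gene, so the protein sequence file would be needed in addition to the flat file. The function names (`load_members`, `assign_groups`) and the `(gene_id, eggnog_protein, hit_start, hit_end)` tuple layout are hypothetical, chosen only for illustration; they are not part of eggNOG or SmashCommunity.

```python
from collections import defaultdict


def load_members(lines):
    """Map eggNOG protein name -> list of (start, end, group, description)."""
    members = defaultdict(list)
    for line in lines:
        line = line.strip()
        # Skip blank lines and the '##' header line of the flat file.
        if not line or line.startswith("#"):
            continue
        # maxsplit=4 keeps the free-text description (which contains spaces) whole.
        fields = line.split(None, 4)
        if len(fields) < 4:
            continue
        name, start, end, group = fields[:4]
        description = fields[4] if len(fields) > 4 else ""
        members[name].append((int(start), int(end), group, description))
    return members


def assign_groups(best_hits, members):
    """Transfer orthologous groups from the best-hit eggNOG protein to each gene.

    best_hits: iterable of (gene_id, eggnog_protein, hit_start, hit_end), where
    hit_start/hit_end are positions on the eggNOG protein (assumed layout).
    A group is transferred only if the hit overlaps the group's region.
    """
    annotation = {}
    for gene, protein, hit_start, hit_end in best_hits:
        groups = [
            group
            for start, end, group, _ in members.get(protein, [])
            if hit_start <= end and hit_end >= start
        ]
        if groups:
            annotation[gene] = groups
    return annotation


# Two lines copied from the example in the question, plus a made-up hit.
example = [
    "## Protein name    start_position    end_position    orthologous_group    description",
    "9685.ENSFCAP00000011039    1    363    meNOG04000    Leucine-Rich repeat protein SHOC-2",
    "7159.AAEL014718-PA    5    527    meNOG04000    Leucine-Rich repeat protein SHOC-2",
]
members = load_members(example)
hits = [("my_gene_1", "7159.AAEL014718-PA", 20, 400)]  # hypothetical best hit
annotation = assign_groups(hits, members)
print(annotation)
```

For the real data you would pass an open file handle to `load_members` instead of the example list. Score or e-value cutoffs for accepting a hit are left out here and would need to be chosen for your data.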

Hope this helps, Sarah

Thank you, this will help me a lot!


