A Quick Way To Annotate Entire Bacterial Genomes To GO?
8.8 years ago
mgalactus ▴ 760

Is there a fast and reliable way to annotate protein sequences from bacterial genomes with GO terms?

The best option so far seems to be to install and run a local copy of InterProScan, but its annotations go way beyond my needs and it's computationally very expensive. Also, there doesn't seem to be a way to speed up the analysis to obtain just the GO terms. Maybe by reducing the number of inspected DBs?
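The "fewer databases" idea can be sketched as follows. This is only a minimal sketch, assuming the InterProScan 5 launcher `interproscan.sh` is on the PATH and accepts the `-i`, `-b`, `-f`, `-appl`, `-iprlookup` and `-goterms` options; the helper name `build_iprscan_command`, the file names and the choice of Pfam are hypothetical and only for illustration. The function builds the command but does not run it.

```python
from typing import List, Sequence


def build_iprscan_command(
    fasta: str,
    out_base: str,
    applications: Sequence[str],
    launcher: str = "interproscan.sh",  # assumption: launcher is on PATH
) -> List[str]:
    """Build (not run) an InterProScan 5 command limited to a few analyses."""
    if not applications:
        raise ValueError("give at least one member database, e.g. 'Pfam'")
    return [
        launcher,
        "-i", fasta,                      # input protein FASTA
        "-b", out_base,                   # output file base name
        "-f", "tsv",                      # tab-separated output
        "-appl", ",".join(applications),  # restrict the member databases
        "-iprlookup",                     # map signatures to InterPro entries
        "-goterms",                       # report GO terms for those entries
    ]


if __name__ == "__main__":
    # Hypothetical file names; pass the list to subprocess.run() to execute it.
    cmd = build_iprscan_command("proteins.faa", "proteins_ipr", ["Pfam"])
    print(" ".join(cmd))
```

Whether this is fast enough depends on which member databases are kept, so it is worth timing on a small subset of proteins first.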

Any other tool/database that performs the annotation in a reasonable time would be fine.

Thanks


Update: it turned out that iprscan version 5 is less aggressive and a bit faster than previous versions: the annotation of ~10'000 proteins took a little more than one day.
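If only the GO terms are needed from an iprscan 5 run, they can be pulled out of the tab-separated output. A minimal sketch, assuming the TSV layout in which the first column is the protein accession and the 14th column holds the GO annotations (present only when GO term lookup was requested); the function name `go_terms_per_protein` is hypothetical and the rows in the example are made-up toy data.

```python
import re
from collections import defaultdict
from typing import Dict, Iterable, Set

GO_ID = re.compile(r"GO:\d{7}")
GO_COLUMN = 13  # 0-based index of the 14th column (assumed TSV layout)


def go_terms_per_protein(lines: Iterable[str]) -> Dict[str, Set[str]]:
    """Collect the set of GO ids reported for each protein accession."""
    terms: Dict[str, Set[str]] = defaultdict(set)
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) <= GO_COLUMN:
            continue  # this match carries no GO column
        found = GO_ID.findall(fields[GO_COLUMN])
        if found:
            terms[fields[0]].update(found)
    return dict(terms)


if __name__ == "__main__":
    # Toy rows for illustration only; real input would be the .tsv file.
    toy = [
        "\t".join(["prot1", "md5", "120", "Pfam", "PF00000", "toy domain",
                   "5", "90", "1e-10", "T", "01-01-2020",
                   "IPR000000", "toy entry", "GO:0005515|GO:0006355"]),
        "\t".join(["prot2", "md5", "80", "Pfam", "PF00000", "toy domain",
                   "1", "40", "1e-5", "T", "01-01-2020"]),
    ]
    for protein, go_ids in go_terms_per_protein(toy).items():
        print(protein, ",".join(sorted(go_ids)))
```

Matching on the `GO:` pattern rather than splitting on `|` keeps the sketch tolerant of any extra text attached to the ids in that column.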

Update 2: the new "mapper" feature of EggNOG (at the time of this edit still in beta) seems even better and way faster (code to have a stand-alone version is here).

gene-ontology bacteria genome annotation • 8.0k views

Hi @mgalactus, did you end up going down the InterProScan path? I'm in the same boat: I need GO terms for my bacterial proteins.


Hi, I actually think that the new beta feature of EggNOG could be way faster and more useful: http://beta-eggnogdb.embl.de/#/app/seqmapper and the code is here: https://github.com/jhcepas/eggnog-mapper

8.8 years ago
Josh Herr 5.7k

This is my opinion, but I believe most people are moving away from using GO terms. If you'd just like to annotate your bacterial genome, I don't think you can do better than PROKKA right now, but there are numerous tools you can try. Prokka is fast and extremely accurate (based on my benchmarking) but I also use MAKER for large genomes -- see this question also: Automated Microbial Gene Prediction From Assembled Genomes -- What Is The Latest, Most Accurate Software?

This question has been asked here many many times, so here are some more links to help you:

Bacterial Annotation Pipeline

About Annotation Of Microbial Genome

Bacterial Annotation

How To Annotate A Newly Sequenced Genome

and many more...


Hi, many thanks for this very detailed reply! Indeed, Prokka is a really smooth and helpful tool for fast and accurate genome annotation. My problem, however, is that I need GO terms to highlight some relevant functional differences between gene sets. GO, with its functional hierarchy and related tools, is the best way to go. In the end, it turned out that iprscan v5 is not as aggressive as the older versions...
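To illustrate what "functional differences between gene sets" can mean in practice, here is a minimal sketch of a per-term over-representation test (one-sided hypergeometric) on a protein-to-GO mapping. It is not the method used in this thread; the function names `hypergeom_sf` and `term_enrichment` and the toy annotations are hypothetical, and the sketch ignores the GO hierarchy and multiple-testing correction, which dedicated enrichment tools handle.

```python
from math import comb
from typing import Dict, Iterable, Set


def hypergeom_sf(k: int, N: int, K: int, n: int) -> float:
    """P(X >= k) when drawing n items from N, of which K carry the term."""
    upper = min(K, n)
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return hits / comb(N, n)


def term_enrichment(
    annotations: Dict[str, Set[str]], subset: Iterable[str]
) -> Dict[str, float]:
    """P-value per GO term for over-representation in `subset`.

    The background is every annotated protein; subset members without
    annotations are ignored.
    """
    background = set(annotations)
    chosen = background & set(subset)
    N, n = len(background), len(chosen)
    candidate_terms = set().union(*(annotations[p] for p in chosen))
    pvalues: Dict[str, float] = {}
    for term in candidate_terms:
        K = sum(term in terms for terms in annotations.values())
        k = sum(term in annotations[p] for p in chosen)
        pvalues[term] = hypergeom_sf(k, N, K, n)
    return pvalues


if __name__ == "__main__":
    # Made-up toy annotations, for illustration only.
    toy = {
        "a": {"GO:0000001"},
        "b": {"GO:0000001"},
        "c": {"GO:0000002"},
        "d": {"GO:0000002"},
    }
    for term, p in sorted(term_enrichment(toy, {"a", "b"}).items()):
        print(term, p)
```

The input mapping has the same shape as a per-protein set of GO ids, so the output of any annotation tool can be fed in once it is reduced to that form.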


+1, No problem... Just wanted to find out why you needed GO terms? I didn't mean to discourage you, but feel like the scientific Kegg Id Vs Cog Id, And The Best Method For Large Batch Id Assignment?. In the end it's best to use something that is easiest for both you and the scientific community to understand.

8.8 years ago
Hranjeev ★ 1.5k

Have you tried BLAST2GO? It has a Java Web Start launcher (which saves the pain of installation) and runs BLAST on public server resources (NCBI BLAST). The mapping of the GO terms is relatively fast. In my experience, a bacterial genome can be completed in 2-3 days.

If you already have the BLAST results in hand, you can cut the analysis time in half.


That is still a bit too slow, as I need GO annotation for at least 10k protein sequences; the good news is that iprscan 5 is less aggressive than the old version.

8.8 years ago
nepgorkhey ▴ 120

I guess you can use the JGI IMG platform to annotate your bacterial genomes by uploading your sequences to their database. You can still keep your sequences private if you don't want others to use them.
