Question

Tools Of Choice For Functional Annotation Of Genes And Proteins

13.1 years ago

NPalopoli ▴ 290

I would like to know which tools you would choose for the functional annotation of genes and proteins. It would be very helpful if you could describe the capabilities, the pros and cons, and your own experience with any method you would or would not recommend.

I am particularly looking for alternatives to the Blast2GO annotation pathway, which I already know is widely used.

annotation function • 13k views

updated 2.5 years ago by Ram 44k • written 13.1 years ago by NPalopoli ▴ 290

Ram · Answer 1 · 2011-08-19

I am using InterProScan for that purpose. The installation process isn't easy and you will need a computer cluster to run it. But it pays off, as InterProScan integrates multiple databases: PROSITE, PRINTS, Pfam, ProDom, SMART, TIGRFAMs, PIR superfamily, SUPERFAMILY, Gene3D, PANTHER and HAMAP. This means you will often get functional annotation from multiple, independent sources. In addition, other tools such as SignalP can optionally be launched by InterProScan.
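To illustrate the "multiple, independent sources" point, here is a minimal sketch that groups InterProScan hits by protein. It assumes the tab-separated output with the usual column order (protein accession first, analysis name in column 4, GO terms in column 14 when GO lookup was requested). The rows, accessions and the helper name `summarize_interproscan` are made up for the example.

```python
import csv
import io
from collections import defaultdict

# Assumed 0-based column positions in InterProScan's TSV output.
PROTEIN, ANALYSIS, SIG_ACC, GO_TERMS = 0, 3, 4, 13


def summarize_interproscan(handle):
    """Group member-database names and GO terms by protein accession."""
    summary = defaultdict(lambda: {"sources": set(), "go": set()})
    for row in csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE):
        if len(row) <= SIG_ACC:
            continue  # skip blank or truncated lines
        entry = summary[row[PROTEIN]]
        entry["sources"].add(row[ANALYSIS])
        # The GO column is optional; "-" means "no value".
        if len(row) > GO_TERMS and row[GO_TERMS] not in ("", "-"):
            for term in row[GO_TERMS].split("|"):
                # Drop an optional "(source)" suffix after the GO identifier.
                entry["go"].add(term.split("(")[0])
    return dict(summary)


# Toy rows, invented for illustration only.
rows = [
    ["prot1", "md5a", "250", "Pfam", "PF00069", "toy description", "10", "240",
     "1.2E-50", "T", "01-01-2020", "IPR000719", "toy entry",
     "GO:0004672|GO:0005524"],
    ["prot1", "md5a", "250", "SMART", "SM00220", "toy description", "10", "240",
     "1.0E-40", "T", "01-01-2020", "IPR000719", "toy entry",
     "GO:0004672(InterPro)"],
    ["prot2", "md5b", "100", "Pfam", "PF00000", "toy description", "5", "90",
     "1.0E-5", "T", "01-01-2020", "-", "-", "-"],
]
handle = io.StringIO("\n".join("\t".join(r) for r in rows))
summary = summarize_interproscan(handle)

# Proteins whose annotation is supported by at least two member databases.
multi = sorted(p for p, e in summary.items() if len(e["sources"]) >= 2)
print(multi)
```

Counting distinct member databases per protein is only a rough proxy for independent support, since several of them can be built from overlapping seed alignments.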

What we normally do, besides InterPro, is phylogenetic reconstruction. We build a phylogenetic tree for every protein from a given species, then we predict one-to-one orthologs and transfer annotation from closely related model species. In this tree, you could transfer annotation from Phy000CVNF_YEAST|SHP1 onto orthologs from closely related species like C. glabrata or other Saccharomyces. But high-throughput phylogenetic reconstruction is a difficult and computationally intensive process. You can have a look at the PhylomeDB paper, which discusses this matter briefly.

Concerning Blast2GO, it is an easy-to-use approach. But it relies entirely on sequence similarity, so I recommend being very careful with those predictions. I would rely on more specific methods like protein profiles (HMMER) or, ideally, phylogenetic trees, and transfer annotation only among one-to-one orthologs.
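The "one-to-one orthologs only" rule can be sketched as follows. This assumes the orthology pairs have already been predicted (for example from the phylogenetic trees, which is the hard part and is not shown). The gene names, the annotation labels and both function names are hypothetical.

```python
from collections import Counter


def one_to_one_pairs(ortholog_pairs):
    """Keep pairs whose query gene and reference gene each occur exactly once."""
    pairs = set(ortholog_pairs)
    query_counts = Counter(q for q, _ in pairs)
    ref_counts = Counter(r for _, r in pairs)
    return sorted(
        (q, r) for q, r in pairs
        if query_counts[q] == 1 and ref_counts[r] == 1
    )


def transfer_annotations(ortholog_pairs, reference_annotations):
    """Copy reference annotations onto query genes, one-to-one orthologs only."""
    transferred = {}
    for query, ref in one_to_one_pairs(ortholog_pairs):
        if ref in reference_annotations:
            transferred[query] = set(reference_annotations[ref])
    return transferred


# Hypothetical (query gene, model-species gene) orthology predictions.
pairs = [
    ("query_1", "SHP1"),   # one-to-one: annotation is transferred
    ("query_2", "REF_A"),  # many-to-one: two query genes share REF_A
    ("query_3", "REF_A"),
    ("query_4", "REF_B"),  # one-to-many: query_4 has two reference orthologs
    ("query_4", "REF_C"),
]
reference = {
    "SHP1": {"example_term_1"},
    "REF_A": {"example_term_2"},
    "REF_B": {"example_term_3"},
}
result = transfer_annotations(pairs, reference)
print(result)
```

Genes involved in duplications on either side are simply left unannotated here, which is the conservative choice the answer argues for.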

Ram · Answer 2 · 2011-08-18

As far as I know, semi-automatic annotation of proteins or genes works either by sequence similarity or by similarity in the naming scheme, with varying degrees of success.

Sequence similarity based: These methods use sequence similarity to assign a gene/protein to either a GO term or an EC number.

Blast2GO
EC-BLAST (closed beta; information)

Name similarity based: These are mostly used in computational modelling, but the general approach applies to any name -> resource identifier mapping and to the tools built for it.

In general, have a look at MIRIAM (publication, Wikipedia), which provides a framework and guidelines for annotating entities in models but can be used in other contexts as well, and at identifiers.org, which is essentially the same service with a newer and broader approach.
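A minimal sketch of a name -> resource identifier mapping, under two assumptions: the lookup table is curated by hand (real tools use far larger dictionaries and fuzzy matching), and identifiers.org resolves compact identifiers of the form `prefix:accession`. The table contents and the function names are illustrative only.

```python
# Hypothetical, hand-curated lookup: normalized name -> (prefix, accession).
LOOKUP = {
    "homo sapiens": ("taxonomy", "9606"),
    "saccharomyces cerevisiae": ("taxonomy", "4932"),
}


def normalize(name):
    """Lower-case the name and collapse runs of whitespace."""
    return " ".join(name.lower().split())


def annotate(name, lookup=LOOKUP):
    """Return an identifiers.org URL for a name, or None if it is unknown."""
    key = normalize(name)
    if key not in lookup:
        return None
    prefix, accession = lookup[key]
    # Assumed URL pattern for compact identifiers.
    return f"https://identifiers.org/{prefix}:{accession}"


print(annotate("  Homo   Sapiens "))
print(annotate("unknown organism"))
```

Exact matching after normalization is the simplest possible strategy; it misses synonyms and misspellings, which is where the "varying degrees of success" come from.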

Ram · Answer 3 · 2015-01-15

9.7 years ago

Yannick Wurm ★ 2.5k

Also have a look at Alexie Papanicolaou's JAMP (Just Annotate my Proteins), which features an intelligent HMM-based approach.

updated 2.5 years ago by Ram 44k • written 9.7 years ago by Yannick Wurm ★ 2.5k