Which Is The Best Way To Do Protein Function Prediction
10.4 years ago
mtyler.jason ▴ 120

Hi all,

I have some proteins whose function I want to predict. What are the best practices for doing this? I can use BLAST and other tools, but I would like better predictions.

10.4 years ago
jackuser1979 ▴ 890
1. Similarity search: Start with BLASTP. If the best hit shares less than 40% identity with your sequence, move on to PSI-BLAST.
2. Domain search: Search for domains using InterProScan, Pfam or CDART.
3. Signal peptide and transmembrane (TM) search: Look for signal peptides using SignalP and for TM regions using TMHMM or Phobius.
4. Comparative modelling: Do homology modelling using SWISS-MODEL. If the BLAST result shows less than 40% identity, go for ab initio modelling using I-TASSER.
5. Gene Ontology classification: Classify your sequence into GO terms using Blast2GO or STRAP.
6. Functional association prediction: Try searching with your sequence in STRING.

For further reading: David Lee, Oliver Redfern & Christine Orengo. Predicting protein function from sequence and structure. Nature Reviews Molecular Cell Biology 8, 995-1005 (December 2007). doi:10.1038/nrm2281
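The 40% identity rule of thumb from step 1 can be applied automatically to BLASTP results. The following is only a rough sketch: it assumes BLAST was run with tabular output (`-outfmt 6`, default column order), and the function names, the cutoff constant and the example rows are made up for illustration.

```python
import csv
import io

# Rule-of-thumb cutoff from step 1 of the answer; adjust to taste.
IDENTITY_CUTOFF = 40.0

def best_hits(blast_tsv):
    """Return {query: (subject, percent_identity)} for the top-bitscore hit per query.

    Assumes default -outfmt 6 columns: qseqid, sseqid, pident, length, mismatch,
    gapopen, qstart, qend, sstart, send, evalue, bitscore.
    """
    best = {}
    for row in csv.reader(io.StringIO(blast_tsv), delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comment lines
        query, subject = row[0], row[1]
        identity, bitscore = float(row[2]), float(row[11])
        if query not in best or bitscore > best[query][2]:
            best[query] = (subject, identity, bitscore)
    return {q: (s, i) for q, (s, i, _) in best.items()}

def suggest_next_step(identity):
    """Translate the identity cutoff into a suggested next step."""
    if identity >= IDENTITY_CUTOFF:
        return "BLASTP hit is usable; continue with domain search"
    return "low identity; try PSI-BLAST or a profile-based search"

# Hypothetical example rows, not real BLAST results.
EXAMPLE = (
    "protA\tsubj1\t62.5\t200\t75\t0\t1\t200\t1\t200\t1e-80\t250\n"
    "protA\tsubj2\t35.0\t180\t110\t3\t5\t180\t8\t190\t1e-20\t90.5\n"
    "protB\tsubj3\t28.1\t150\t100\t4\t5\t150\t10\t160\t2e-05\t45.1\n"
)

for query, (subject, identity) in best_hits(EXAMPLE).items():
    print(query, subject, identity, suggest_next_step(identity))
```

The cutoff is a heuristic, not a hard limit, so borderline hits are still worth inspecting by hand.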


Why would you do comparative modelling if you are only interested in the function? The homology search done before should be enough to realize what kind of protein you have and what function it might perform. Additionally, ab initio modelling is close to useless for proteins.


These steps are not a hard-and-fast rule. I think it is always better to establish the structure-function relationship for the protein of interest. For a hypothetical protein, say, structure prediction always helps in some way to learn something about that protein.


I understand; I am just questioning the inclusion of homology modelling / ab initio modelling. Structure prediction only helps if it is reliable. Ab initio modelling is often not reliable, so it will not help. More likely, it will bias the researcher in the wrong direction.

10.4 years ago
Leszek 4.2k

Give InterProScan a try. It's an all-in-one solution that will classify your proteins into families, predict protein domains, and annotate putative functions and GO terms.
You can install it locally if you need to annotate an entire genome. More info here.
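If you run InterProScan locally with TSV output and GO lookup enabled, the GO terms can be pulled out with a few lines of Python. This is a minimal sketch that assumes the GO annotations sit in the 14th tab-separated column, which is my understanding of the TSV layout; the function name and the example rows are illustrative only, so check them against your own output.

```python
import re
from collections import defaultdict

# Matches GO identifiers such as GO:0005524, ignoring any suffix after them.
GO_PATTERN = re.compile(r"GO:\d{7}")

def go_terms_by_protein(tsv_text):
    """Collect the set of GO terms reported for each protein accession.

    Assumption: column 1 is the protein accession and column 14 holds the
    GO annotations, separated by '|'.
    """
    terms = defaultdict(set)
    for line in tsv_text.splitlines():
        fields = line.split("\t")
        if len(fields) < 14:
            continue  # row without a GO column
        terms[fields[0]].update(GO_PATTERN.findall(fields[13]))
    return {protein: sorted(gos) for protein, gos in terms.items() if gos}

# Illustrative rows only, not real InterProScan output.
EXAMPLE = "\n".join([
    "prot1\tmd5a\t300\tPfam\tPF00069\tProtein kinase domain\t10\t280\t1.2E-50\tT\t"
    "01-01-2020\tIPR000719\tProtein kinase domain\tGO:0004672|GO:0005524\t-",
    "prot1\tmd5a\t300\tSMART\tSM00220\tkinase\t10\t280\t3.0E-40\tT\t"
    "01-01-2020\tIPR000719\tProtein kinase domain\tGO:0005524|GO:0006468\t-",
    "prot2\tmd5b\t120\tCoils\tCoil\tCoil\t5\t40\t-\tT\t01-01-2020",
])

for protein, gos in go_terms_by_protein(EXAMPLE).items():
    print(protein, ", ".join(gos))
```

Proteins with no GO annotation are simply left out of the result, which makes it easy to see which sequences still need another approach.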


Please do not use that InterProScan link, it is a testing/development interface only. The public usage InterProScan interface is part of the InterPro website: http://www.ebi.ac.uk/interpro/

EMBL-EBI provide Web Services for InterProScan (REST or SOAP), which can be used if you do not have the compute resources to run InterProScan locally.


ok, changed that

10.4 years ago

If you are working with protein sequences, Pfam is a commonly used tool for predicting functional domains:

http://pfam.sanger.ac.uk/

9.4 years ago

Blast2GO wins. I'm currently working on this problem, and the workflow in Blast2GO is really sleek.

9.4 years ago
Siva ★ 1.9k

You can also try the powerful profile HMM vs. profile HMM search implemented in HHpred. This is especially useful if your sequences have low similarity to known proteins. You can search against several precomputed HMM databases, including most protein domain databases, PDB and even COGs.

9.4 years ago

Sequence similarity can only get you so far. If all you're after is molecular function, e.g. enzymatic activity, sequence analysis is definitely where you should start. If by function you mean the biological process the protein is involved in, then you will probably need more than sequence analysis. If you're lucky, your protein has a well-characterized ortholog in another organism, so using good orthology resources (e.g. TreeFam) will help. Otherwise, you can use a gene function prediction (a.k.a. gene prioritization) tool such as funl or one of the many tools listed here. The basic idea is simple: given a query composed of some genes, rank the rest of the genome by some measure of functional similarity to the query. So you could use your protein as the query and see which genes are the most similar (i.e. functionally related). Many of these tools were designed with disease gene prioritization in mind, but some are suitable for measuring functional similarity; just make sure you understand what data they use and how.
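To make the ranking idea concrete, here is a toy sketch. It is not how funl or any particular tool works: it assumes functional similarity can be approximated by the Jaccard overlap of annotation sets, and all gene names and annotations are invented.

```python
def jaccard(a, b):
    """Jaccard similarity of two sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def rank_candidates(query, annotations):
    """Rank all non-query genes by their mean similarity to the query genes.

    query       -- list of gene names used as the query
    annotations -- {gene name: set of annotation terms}
    """
    query_set = set(query)
    scores = {}
    for gene, terms in annotations.items():
        if gene in query_set:
            continue  # only rank the rest of the genome
        total = sum(jaccard(terms, annotations[q]) for q in query)
        scores[gene] = total / len(query)
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)

# Hypothetical annotations for illustration only.
ANNOTATIONS = {
    "geneQ": {"dna repair", "nucleus"},
    "geneA": {"dna repair", "nucleus", "atp binding"},
    "geneB": {"nucleus"},
    "geneC": {"membrane", "transport"},
}

for gene, score in rank_candidates(["geneQ"], ANNOTATIONS):
    print(f"{gene}\t{score:.2f}")
```

Real tools replace the annotation sets with other evidence such as co-expression, interaction networks or text mining, which is why it matters to know what data a given tool uses.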
