How Would You Define If A Peptide Is Exposed At The Surface Of A Protein?
12.4 years ago

An analysis has given me a number of 15-amino-acid peptides from bacterial proteins. I want to rapidly screen these for the ones most likely to be surface-exposed in the tertiary structure of the protein.

Do you know of a robust mechanism for achieving this? Is a measure of hydropathy the best I can do? Is there a rapid way, assuming a 3-dimensional model, of defining if a particular stretch of residues is surface-exposed? (preferably a non-visual, scriptable means)

protein structure analysis • 7.6k views

Do you have structures (PDB) for these proteins, or just sequences?


Sequences only. I was planning to use ModBase to pull homology models where possible (and where the PDB doesn't already provide a structure). However, I really do need a sequence-first approach for this screen.


Consider that this calculation differs between proteins in the cytoplasm and proteins in the ER/secretory pathway, because there is a big difference in the pH of these two compartments.

12.4 years ago

If you have a structure, you can use NACCESS, or the PSA/TEM file output from the JOY package, to determine the accessible surface area of individual residues. If you only have sequence, as suggested by bubaker, you can get a rough estimate by calculating the percentage of amino acids in your peptide that fall into each class of the following scheme (Surface, Neutral and Buried), which describes the typical exposure of each residue type:

• Surface Residues : EDKNQR
• Neutral Residues : AGHPSTY
• Buried Residues : CFILMVW
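
A minimal sketch of that composition estimate, using only the three residue classes listed above. The function names are made up for illustration, and ranking by surface fraction is just one possible way to use the numbers; it is a crude screen, not a structure-aware prediction.

```python
# Residue classes copied from the scheme above.
SURFACE = set("EDKNQR")
NEUTRAL = set("AGHPSTY")
BURIED = set("CFILMVW")


def exposure_composition(peptide):
    """Return the fraction of surface, neutral, buried and other residues."""
    peptide = peptide.upper()
    if not peptide:
        raise ValueError("peptide must not be empty")
    counts = {"surface": 0, "neutral": 0, "buried": 0, "other": 0}
    for aa in peptide:
        if aa in SURFACE:
            counts["surface"] += 1
        elif aa in NEUTRAL:
            counts["neutral"] += 1
        elif aa in BURIED:
            counts["buried"] += 1
        else:
            # Ambiguity codes such as X or B end up here.
            counts["other"] += 1
    return {name: n / len(peptide) for name, n in counts.items()}


def rank_by_surface_fraction(peptides):
    """Sort peptides from highest to lowest fraction of surface-class residues."""
    return sorted(
        peptides,
        key=lambda p: exposure_composition(p)["surface"],
        reverse=True,
    )
```

For a screen, you would call `rank_by_surface_fraction` on the list of 15-mers and inspect the top of the list first.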
12.4 years ago
Neilfws 49k

With structures (even good homology models), I'd recommend DSSP or STRIDE. From sequence, it is a difficult problem. Perhaps PHDacc, part of the PredictProtein suite, is one option - although that's a web server tool, I don't know how script-able it is.

I don't think that any predictions from sequence could be described as robust - certainly nowhere near the accuracy of, say, secondary structure prediction. Structural context is very important: for example, charged residues have higher propensity for exposure, but can just as easily be buried if they form pairs.
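
If you do get per-residue accessibilities out of a structure-based tool, summarising them over a peptide is easy to script. The sketch below assumes you already have a list of relative accessibilities (0.0 to 1.0), one per residue of the full protein. DSSP reports absolute accessibility, so its values would first need normalising by a per-residue maximum, which is not shown here. The 0.25 cut-off and the function name are assumptions for illustration only.

```python
def peptide_exposure(rsa, start, length=15, threshold=0.25):
    """Summarise the exposure of one peptide inside a protein.

    rsa       -- relative solvent accessibility per residue (0.0 to 1.0)
    start     -- 0-based index of the first residue of the peptide
    length    -- peptide length
    threshold -- RSA at or above which a residue counts as exposed (assumed)
    """
    if start < 0 or start + length > len(rsa):
        raise ValueError("peptide does not fit inside the protein")
    window = rsa[start:start + length]
    exposed = sum(1 for value in window if value >= threshold)
    return {
        "mean_rsa": sum(window) / length,
        "fraction_exposed": exposed / length,
    }
```

Keep in mind the caveat above: a peptide can score well on average and still contain buried charged pairs.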

12.3 years ago
Ruchira ▴ 230

SABLE predicts solvent accessibility, secondary structure and transmembrane domains from sequence, and is available either as a web server or downloadable software. I've used the downloadable version. It gives you output like this:

SECTION_SA

Relative solvent accessibility prediction
0 -> fully Buried
9 -> fully Exposed
3rd line -> confidence level (scale from 0 to 9, corresponding to p=0.0 or low confidence and p=0.9 or high confidence, respectively)

>   1                                                              60
554303030456533120403353504400510063462525302010333415565115
445797748835446827696545584696888765325443495976545584455574
>  61                                                              79
DLDMEDNDIIEAHREQIGG
4151554110201221233
5545445568487445533
END_SECTION


Edit: The cool syntax coloring is due to BioStar, the SABLE output is just ASCII text. :-)
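
Because the output is plain text, it is straightforward to parse in a script. Here is a rough sketch that assumes each block in SECTION_SA is a `>` position line followed by a sequence line, an accessibility line and a confidence line, as in the second block above. The function names and the embedded `SAMPLE` are for illustration only; I have not checked this against every SABLE version, so verify that the sequence and digit strings come out the same length on your own files.

```python
SAMPLE = """SECTION_SA
>  61                                                              79
DLDMEDNDIIEAHREQIGG
4151554110201221233
5545445568487445533
END_SECTION
"""


def parse_sable_sa(text):
    """Return (sequence, accessibility, confidence) strings from SECTION_SA."""
    seq, acc, conf = [], [], []
    in_section = False
    digit_lines = 0  # digit-only lines seen since the last '>' line
    for raw in text.splitlines():
        line = raw.strip()
        if line == "SECTION_SA":
            in_section = True
            continue
        if in_section and line == "END_SECTION":
            break
        if not in_section or not line:
            continue
        if line.startswith(">"):
            digit_lines = 0  # a new block starts here
        elif line.isdigit():
            if digit_lines == 0:
                acc.append(line)   # first digit line: accessibility (0-9)
            elif digit_lines == 1:
                conf.append(line)  # second digit line: confidence (0-9)
            digit_lines += 1
        elif line.isalpha() and line.isupper():
            seq.append(line)       # amino acid sequence
        # anything else (legend text) is ignored
    return "".join(seq), "".join(acc), "".join(conf)


def mean_accessibility(acc, start, length=15):
    """Mean predicted accessibility (0-9 scale) of a peptide starting at start."""
    if start < 0 or start + length > len(acc):
        raise ValueError("peptide does not fit inside the sequence")
    return sum(int(c) for c in acc[start:start + length]) / length
```

You could then slide `mean_accessibility` along each protein, or look up the start position of each candidate peptide, and optionally weight by the confidence digits.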

Also, if you don't have an existing structure, you could try looking for a homology model at ModBase. You may be able to use the linked ModWeb modelling server to build one if it isn't already in ModBase. Then you could use one of the structure-based methods on the model. Of course, you would have lower confidence in these predictions since they would depend on the quality of the model.


SABLE looks interesting. How well does it perform? How long does it take to calculate these features for a typical sequence of 100 residues?

12.4 years ago
Tom Walsh ▴ 550

Jpred can predict residue solvent accessibility from sequence (full-length proteins, not peptides) and is 78-88% accurate depending on which % threshold you use to define buried vs exposed [1].

(I'm the sysadmin in the group that develops Jpred but not one of the authors).


This is interesting, I didn't know this before. What algorithm does Jpred use for solvent accessibility calculations? How does the prediction differ if the input is a peptide (say, length <20) rather than a full-length protein? Can you also share the reference for the Jpred implementation of solvent accessibility calculation? Thanks


Jpred uses MSAs of known structures as a training set for a neural net, which is used to do the predictions. Solvent accessibility assignments are done using DSSP.

The neural net uses a 17-residue sliding window so it really only works on full-length proteins.
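
To illustrate why peptide input is a problem, here is a small sketch that counts the positions whose 17-residue window fits entirely inside the sequence. It only illustrates the window arithmetic; how Jnet actually treats the termini is not covered here, and the function name is made up.

```python
def full_window_centres(seq_length, window=17):
    """Return 0-based positions whose centred window lies inside the sequence."""
    if window % 2 == 0:
        raise ValueError("window must be odd so that it has a centre residue")
    half = window // 2  # residues needed on each side of the centre
    return list(range(half, seq_length - half))


# A 15-residue peptide has no position with a complete 17-residue window,
# while a 100-residue protein has 84 of them.
print(len(full_window_centres(15)), len(full_window_centres(100)))
```

So for a 15-mer submitted on its own, every residue would be predicted from an incomplete window.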

The underlying algorithm (called Jnet) is described in this paper

HTML preprint of it here

Full list of refs here

12.4 years ago
Suk211 ★ 1.1k

I think without knowing the structure it will be difficult to carry out this prediction but I hope this webserver might be of some help:

HSEpred: predict half-sphere exposure from protein sequences

12.4 years ago
Hanif Khalak ★ 1.3k

I don't know of software you can download which does this, but one approach which I've seen is to align many sequences for which structure - and so accessibility (exposed/buried status) of each residue - is known and build a statistical model.

You might be able to try this using HMMer, since you can input alignments and do class prediction.
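
As a much simpler stand-in for that kind of statistical model, the sketch below estimates a log-odds exposure propensity for each amino acid type from sequences whose residues are already labelled exposed (`e`) or buried (`b`). It does not use alignments or HMMER, the label format and function names are assumptions, and the training data would have to come from real structures.

```python
import math
from collections import Counter


def train_propensities(examples, pseudocount=1.0):
    """Estimate log-odds (exposed vs buried) for each amino acid type.

    examples -- iterable of (sequence, labels) pairs; labels is a string
                of 'e' (exposed) or 'b' (buried), one character per residue
    """
    exposed, buried = Counter(), Counter()
    for seq, labels in examples:
        if len(seq) != len(labels):
            raise ValueError("sequence and labels differ in length")
        for aa, label in zip(seq.upper(), labels):
            if label == "e":
                exposed[aa] += 1
            elif label == "b":
                buried[aa] += 1
            else:
                raise ValueError("labels must be 'e' or 'b'")
    residues = set(exposed) | set(buried)
    # Pseudocounts avoid log(0) for residue types seen in only one class.
    total_e = sum(exposed.values()) + pseudocount * len(residues)
    total_b = sum(buried.values()) + pseudocount * len(residues)
    return {
        aa: math.log(
            ((exposed[aa] + pseudocount) / total_e)
            / ((buried[aa] + pseudocount) / total_b)
        )
        for aa in residues
    }


def score_peptide(peptide, propensities):
    """Mean log-odds over the peptide; unseen residue types contribute 0."""
    if not peptide:
        raise ValueError("peptide must not be empty")
    return sum(propensities.get(aa, 0.0) for aa in peptide.upper()) / len(peptide)
```

A positive score means the peptide is enriched in residue types that were mostly exposed in the training set. A position-specific model built from alignments should do better than this per-residue average.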

12.4 years ago

I do not know of a method to predict exposed residues in a protein. During my MSc in the structural biology community, our approach was based on an elimination scheme: first removing residues that are surely buried, then looking for transmembrane segments, then predicting secondary structure and mapping electrostatics and related potentials onto it. After these steps you can remove more than 80% of the candidates. We once tried to use the trained dataset of WHAT_CHECK to infer the position of a residue from its neighborhood. At that time my computing skills were in their infancy, so our results were hilarious at best. But you can try it too. David Jones's threading work has similar trained datasets that you could use.
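
For the "looking for transmembrane segments" step, one classic scriptable option is a Kyte-Doolittle hydropathy scan. The sketch below uses the published Kyte-Doolittle scale with a 19-residue window and a mean hydropathy cut-off of 1.6. It is not what we used back then, just a minimal illustration, and the function names are made up. Dedicated transmembrane predictors will be more reliable.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(seq, window=19):
    """Mean hydropathy of every full window along the sequence.

    Returns an empty list if the sequence is shorter than the window.
    Non-standard residue codes raise KeyError.
    """
    seq = seq.upper()
    return [
        sum(KD[aa] for aa in seq[i:i + window]) / window
        for i in range(len(seq) - window + 1)
    ]


def has_putative_tm_segment(seq, window=19, cutoff=1.6):
    """True if any window has mean hydropathy above the cut-off."""
    return any(value > cutoff for value in hydropathy_profile(seq, window))
```

Peptides that fall inside a flagged window of the parent protein could then be dropped as unlikely to be solvent-exposed.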