Question

How Would You Define If A Peptide Is Exposed At The Surface Of A Protein?

6

13.9 years ago

Simon Cockell 7.4k

The results of an analysis provide me with a number of 15-amino-acid peptides from bacterial proteins. I want to rapidly screen these for those that are most likely to be surface-exposed in the tertiary structure of the protein.

Do you know of a robust mechanism for achieving this? Is a measure of hydropathy the best I can do? Is there a rapid way, assuming a 3-dimensional model, of defining if a particular stretch of residues is surface-exposed? (preferably a non-visual, scriptable means)

protein structure analysis • 9.0k views

updated 13.9 years ago by Ruchira ▴ 230 • written 13.9 years ago by Simon Cockell 7.4k

0

Do you have structures (PDB) for these proteins, or just sequences?

13.9 years ago by Neilfws 49k

0

Sequences only. Was planning on using ModBase to pull homology models where possible (and where the PDB doesn't already provide). However, I do really require a sequence-first approach to this screen.

updated 4.6 years ago by Ram 43k • written 13.9 years ago by Simon Cockell 7.4k

0

Consider that this calculation differs between proteins in the cytoplasm and proteins in the ER/secretory pathway, because there is a big difference in the pH of these two compartments.

13.9 years ago by Giovanni M Dall'Olio 28k

score 6 · Answer 1 · 2010-05-28

If you have a structure, you can use NACCESS, or the PSA/TEM file output from the JOY package, to determine the accessible surface area of individual residues. If you only have sequence, as suggested by bubaker, you can get a rough estimate by calculating the percentage of residues in your peptide that fall into each class of the following scheme (Surface, Neutral and Buried), which describes the typical exposure of each residue type.

Surface Residues : EDKNQR
Neutral Residues : AGHPSTY
Buried Residues : CFILMVW
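
As a rough illustration of the percentage calculation described above, here is a minimal Python sketch. The three residue classes are copied from this answer; the function names and the example peptides are hypothetical, and ranking by the surface fraction alone is just one possible way to use the numbers.

```python
from collections import Counter

# Residue classes copied from the answer above.
SURFACE = set("EDKNQR")
NEUTRAL = set("AGHPSTY")
BURIED = set("CFILMVW")


def exposure_fractions(peptide):
    """Return the fraction of surface, neutral, buried and unknown residues."""
    peptide = peptide.upper()
    if not peptide:
        raise ValueError("empty peptide")
    counts = Counter()
    for aa in peptide:
        if aa in SURFACE:
            counts["surface"] += 1
        elif aa in NEUTRAL:
            counts["neutral"] += 1
        elif aa in BURIED:
            counts["buried"] += 1
        else:
            counts["unknown"] += 1  # e.g. X, B, Z or gap characters
    keys = ("surface", "neutral", "buried", "unknown")
    return {k: counts[k] / len(peptide) for k in keys}


def rank_by_surface(peptides):
    """Sort peptides so those richest in surface-class residues come first."""
    return sorted(
        peptides,
        key=lambda p: exposure_fractions(p)["surface"],
        reverse=True,
    )


if __name__ == "__main__":
    # Hypothetical 15-mers, for illustration only.
    for pep in rank_by_surface(["LIVFWMCAILVFGAL", "KDEENQRKSTGDEKR"]):
        print(pep, exposure_fractions(pep))
```

This only counts residue types, so it ignores structural context entirely and should be treated as a first-pass filter.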

score 6 · Answer 2 · 2010-05-28

With structures (even good homology models), I'd recommend DSSP or STRIDE. From sequence, it is a difficult problem. Perhaps PHDacc, part of the PredictProtein suite, is one option - although that's a web server tool, I don't know how script-able it is.

I don't think that any predictions from sequence could be described as robust - certainly nowhere near the accuracy of, say, secondary structure prediction. Structural context is very important: for example, charged residues have higher propensity for exposure, but can just as easily be buried if they form pairs.
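
If you do have per-residue accessible surface areas from a structure-based tool such as DSSP, scoring a particular stretch can be scripted along these lines. This is a sketch under several assumptions: the (amino acid, ASA) pairs are assumed to be already parsed from the tool's output, the maximum ASA values are the theoretical ones reported by Tien et al. (2013) and should be checked against that source, and the 25% cut-off for calling a residue exposed is an arbitrary choice, not something fixed by DSSP or STRIDE.

```python
# Theoretical maximum ASA per residue in square angstroms, as reported by
# Tien et al. (2013). Assumption: verify these numbers before relying on them.
MAX_ASA = {
    "A": 129.0, "R": 274.0, "N": 195.0, "D": 193.0, "C": 167.0,
    "Q": 225.0, "E": 223.0, "G": 104.0, "H": 224.0, "I": 197.0,
    "L": 201.0, "K": 236.0, "M": 224.0, "F": 240.0, "P": 159.0,
    "S": 155.0, "T": 172.0, "W": 285.0, "Y": 263.0, "V": 174.0,
}


def relative_accessibility(residues):
    """Convert (amino_acid, absolute_ASA) pairs into relative values in 0..1."""
    return [min(asa / MAX_ASA[aa], 1.0) for aa, asa in residues]


def exposed_fraction(residues, start, length=15, threshold=0.25):
    """Fraction of residues[start:start+length] whose RSA is >= threshold.

    `threshold` is an assumed cut-off; choose one that suits your screen.
    """
    window = residues[start:start + length]
    if len(window) < length:
        raise ValueError("window runs past the end of the chain")
    rsa = relative_accessibility(window)
    return sum(1 for r in rsa if r >= threshold) / length
```

Calling `exposed_fraction(chain, start)` for each peptide's position in its parent chain gives a single number per peptide that can be sorted or thresholded.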

Ram · Answer 3 · 2010-07-03

SABLE predicts solvent accessibility, secondary structure and transmembrane domains from sequence, and is available either as a web server or downloadable software. I've used the downloadable version. It gives you output like this:

SECTION_SA

Relative solvent accessibility prediction
0 -> fully Buried
9 -> fully Exposed
3rd line -> confidence level (scale from 0 to 9, corresponding to p=0.0 or low confidence and p=0.9 or high confidence, respectively)

>   1                                                              60
     PETHINLKVSDGSSEIFFKIKKTTPLRRLMEAFAKRQGKEMDSLRFLYDGIRIQADQTPE
     554303030456533120403353504400510063462525302010333415565115
     445797748835446827696545584696888765325443495976545584455574
>  61                                                              79
     DLDMEDNDIIEAHREQIGG
     4151554110201221233
     5545445568487445533
END_SECTION
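
Since the output is plain text, pulling out the accessibility digits for a given peptide is easy to script. The sketch below assumes the layout shown above (each `>` header line followed by a sequence line, an accessibility line and a confidence line, all of equal length); I have not checked it against other SABLE versions, and the function names are made up for this example.

```python
def parse_sable_sa(text):
    """Parse SECTION_SA text into (sequence, accessibility, confidence).

    Assumes every line starting with '>' is followed by three lines:
    the sequence, 0-9 accessibility digits and 0-9 confidence digits.
    """
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    seq_parts, acc, conf = [], [], []
    i = 0
    while i < len(lines):
        if lines[i].startswith(">") and i + 3 < len(lines):
            s, a, c = lines[i + 1], lines[i + 2], lines[i + 3]
            if not (len(s) == len(a) == len(c)):
                raise ValueError("misaligned block after line: " + lines[i])
            seq_parts.append(s)
            acc.extend(int(ch) for ch in a)
            conf.extend(int(ch) for ch in c)
            i += 4
        else:
            i += 1
    return "".join(seq_parts), acc, conf


def peptide_mean_accessibility(seq, acc, peptide):
    """Mean predicted accessibility (0-9 scale) over the peptide's residues."""
    start = seq.find(peptide)
    if start < 0:
        raise ValueError("peptide not found in sequence")
    window = acc[start:start + len(peptide)]
    return sum(window) / len(window)
```

You could also weight each digit by its confidence value, but how best to combine the two lines is a judgment call.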

Edit: The cool syntax coloring is due to BioStar, the SABLE output is just ASCII text. :-)

Also, if you don't have an existing structure, you could try looking for a homology model at ModBase. You may be able to use the linked ModWeb modelling server to build one if it isn't already in ModBase. Then you could use one of the structure-based methods on the model. Of course, you would have lower confidence in these predictions since they would depend on the quality of the model.

Ram · Answer 4 · 2010-05-29

3

13.9 years ago

Tom Walsh ▴ 550

Jpred can predict residue solvent accessibility from sequence (full-length proteins, not peptides) and is 78-88% accurate depending on which % threshold you use to define buried vs exposed [1].

(I'm the sysadmin in the group that develops Jpred but not one of the authors).

13.9 years ago by Tom Walsh ▴ 550

0

This is interesting; I didn't know this before. What algorithm does Jpred use for solvent accessibility calculations? How does the prediction differ if the input is a peptide (say, length <20) rather than a full-length protein? Can you also share the reference for Jpred's implementation of solvent accessibility calculation? Thanks.

13.9 years ago by Khader Shameer 18k

0

Jpred uses MSAs of known structures as a training set for a neural net, which is used to do the predictions. Solvent accessibility assignments are done using DSSP.

The neural net uses a 17-residue sliding window so it really only works on full-length proteins.
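
To see why the window size matters for short inputs, here is a small Python illustration that simply enumerates the complete 17-residue windows in a sequence. It is not Jnet code, and how Jnet treats chain ends is not covered here; the function name is made up for this example.

```python
def full_windows(sequence, width=17):
    """Return (centre_index, window) for residues with a complete window.

    A residue has a complete window only if width // 2 residues exist on
    both sides of it.
    """
    half = width // 2
    return [
        (i, sequence[i - half:i + half + 1])
        for i in range(half, len(sequence) - half)
    ]


if __name__ == "__main__":
    print(len(full_windows("A" * 15)))  # a 15-mer peptide
    print(len(full_windows("A" * 79)))  # a 79-residue chain
```

With a width of 17, a 15-residue peptide has no residue with a complete window, which is why the peptides should be scored in the context of their parent proteins.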

The underlying algorithm (called Jnet) is described in this paper

HTML preprint of it here

Full list of refs here

updated 5.6 years ago by Ram 43k • written 13.9 years ago by Tom Walsh ▴ 550

Istvan Albert · Answer 5 · 2010-05-28

2

13.9 years ago

Suk211 ★ 1.1k

I think without knowing the structure it will be difficult to carry out this prediction but I hope this webserver might be of some help:

HSEpred: predict half-sphere exposure from protein sequences

updated 13.9 years ago by Istvan Albert 100k • written 13.9 years ago by Suk211 ★ 1.1k

score 1 · Answer 6 · 2010-05-28

I don't know of downloadable software that does this, but one approach I've seen is to align many sequences for which the structure, and therefore the accessibility (exposed/buried status) of each residue, is known, and then build a statistical model.

You might be able to try this using HMMer, since you can input alignments and do class prediction.
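
To make the "statistical model" idea concrete, here is a much simpler stand-in than an HMM: per-amino-acid log-odds of exposure estimated from sequences whose exposed/buried labels are known. It ignores alignment position and neighbouring residues, so it is only a sketch of the general approach. The input format ('e'/'b' label strings), the pseudocount and the function names are all assumptions made for this example.

```python
import math
from collections import Counter


def train_propensities(labelled, pseudocount=1.0):
    """Estimate the log-odds of exposure for each amino acid.

    `labelled` is an iterable of (sequence, labels) pairs, where labels is
    a string of 'e' (exposed) or 'b' (buried), one character per residue.
    """
    exposed, buried = Counter(), Counter()
    for seq, labels in labelled:
        if len(seq) != len(labels):
            raise ValueError("sequence and labels differ in length")
        for aa, lab in zip(seq, labels):
            (exposed if lab == "e" else buried)[aa] += 1
    alphabet = set(exposed) | set(buried)
    # Pseudocounts keep the ratio finite for residues seen in one class only.
    n_exp = sum(exposed.values()) + pseudocount * len(alphabet)
    n_bur = sum(buried.values()) + pseudocount * len(alphabet)
    return {
        aa: math.log(
            ((exposed[aa] + pseudocount) / n_exp)
            / ((buried[aa] + pseudocount) / n_bur)
        )
        for aa in alphabet
    }


def score_peptide(peptide, propensities):
    """Mean log-odds over the peptide; residues unseen in training score 0."""
    if not peptide:
        raise ValueError("empty peptide")
    return sum(propensities.get(aa, 0.0) for aa in peptide) / len(peptide)
```

A positive score means the peptide is enriched in residue types that were mostly exposed in the training set; labels could come from DSSP accessibilities with whatever threshold you choose.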

score 1 · Answer 7 · 2010-05-28

I do not know of a method to predict exposed residues in a protein. During my MSc in the structural biology community, our approach was based on an elimination scheme: first removing residues that are surely buried, then looking for transmembrane segments, then predicting secondary structures and mapping electrostatics and related potentials over them. After these steps you can remove more than 80% of the candidates. We once tried to use the trained dataset of WHAT_CHECK to infer the position of a residue from its neighborhood. At that time my computing skills were in their infancy, so our results were hilarious at best. But you can try it too. David Jones's threading work has similar trained datasets that you could use.