Question

Modelling A Protein Using Modeller

7


12.9 years ago

Maria ▴ 70

Hi friends,

To model a protein, how can I find the right template for my target sequence? For example, my target sequence is Q9BXM7, and using BLASTp I found the template 3DXN. On what basis can I tell whether the template I have selected is the correct one? Kindly help, as I am quite new to this field.

proteomics protein • 4.2k views

ADD COMMENT • link updated 11.3 years ago by Dollas Salleh ▴ 70 • written 12.9 years ago by Maria ▴ 70

score 9 · Answer 1 · 2011-06-02

You need to assess the alignment between your target and the proposed template. You can get both sequences from the relevant databases: Q9BXM7 from UniProt and 3DXN from the PDB. Use a program like water from the EMBOSS suite to perform a local pairwise alignment and look at the sequence conservation (I do a local alignment here, rather than a global one, because looking at the two sequences, it seems unlikely that they align along their whole length).
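As a rough sketch of how such a run could be scripted, the Python below builds an EMBOSS water command line and runs it only if water is installed. The FASTA and output file names are hypothetical placeholders, and the gap penalties are simply the values shown in the report header (10.0 and 0.5).

```python
import shutil
import subprocess


def water_command(target_fasta, template_fasta, outfile,
                  gapopen=10.0, gapextend=0.5):
    """Build the argument list for an EMBOSS water local alignment."""
    return [
        "water",
        "-asequence", target_fasta,
        "-bsequence", template_fasta,
        "-gapopen", str(gapopen),
        "-gapextend", str(gapextend),
        "-outfile", outfile,
    ]


def run_water(target_fasta, template_fasta, outfile):
    """Run water; raise a clear error if EMBOSS is not on the PATH."""
    if shutil.which("water") is None:
        raise RuntimeError("EMBOSS 'water' was not found on the PATH")
    cmd = water_command(target_fasta, template_fasta, outfile)
    subprocess.run(cmd, check=True)


# Hypothetical file names: save the two sequences under these names first.
example_cmd = water_command("Q9BXM7.fasta", "3DXN.fasta", "pink1_vs_3dxn.water")
```

Calling run_water("Q9BXM7.fasta", "3DXN.fasta", "pink1_vs_3dxn.water") should then write a report in the same style as the one shown here.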

#=======================================
#
# Aligned_sequences: 2
# 1: PINK1_HUMAN
# 2: SEQUENCE
# Matrix: EBLOSUM62
# Gap_penalty: 10.0
# Extend_penalty: 0.5
#
# Length: 282
# Identity:      82/282 (29.1%)
# Similarity:   114/282 (40.4%)
# Gaps:          62/282 (22.0%)
# Score: 195.0
# 
#
#=======================================

PINK1_HUMAN      250 AGEYGAVTYRKSKRGPKQLAPHPNIIRVLRAFTSSVPLLPGALVDYPDVL    299
                     :|.||.|...|.|     |......|::::..:.:.....|||:|...||
SEQUENCE          31 SGAYGEVLLCKDK-----LTGAERAIKIIKKSSVTTTSNSGALLDEVAVL     75

PINK1_HUMAN      300 PSRLHP------EGLGHGRTLFLVMKNY-------PCTLRQYLCVNTPSP    336
                     ....||      |.....|..:|||:.|       ...|||..     |.
SEQUENCE          76 KQLDHPNIMKLYEFFEDKRNYYLVMEVYRGGELFDEIILRQKF-----SE    120

PINK1_HUMAN      337 RLAAMMLLQLLEGVDHLVQQGIAHRDLKSDNILVELDPDGCPWLVIADFG    386
                     ..||:::.|:|.|..:|.:..|.|||||.:|:|:| .......:.|.|||
SEQUENCE         121 VDAAVIMKQVLSGTTYLHKHNIVHRDLKPENLLLE-SKSRDALIKIVDFG    169

PINK1_HUMAN      387 CCLADESIGLQLPFSSWYVDRGGNGCLMAPEVSTARPGPRAVIDYSKADA    436
                     .. |...:|.::.      :|.|....:||||.      |...| .|.|.
SEQUENCE         170 LS-AHFEVGGKMK------ERLGTAYYIAPEVL------RKKYD-EKCDV    205

PINK1_HUMAN      437 WAVGAIAYEIFGLVNPFYGQGKAHLESR------SYQEAQLPALPESVPP    480
                     |:.|.|.|.:.....||.||....:..|      |:.           ||
SEQUENCE         206 WSCGVILYILLCGYPPFGGQTDQEILKRVEKGKFSFD-----------PP    244

PINK1_HUMAN      481 D-------VRQLVRALLQREASKRPSARVAAN    505
                     |       .:|||:.:|..|.|||.||..|.|
SEQUENCE         245 DWTQVSDEAKQLVKLMLTYEPSKRISAEEALN    276

With 29.1% identity, you are getting down towards the "twilight zone", but it is not unreasonable to suggest this would be a workable template for Modeller.
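To make the numbers in the report header concrete, here is a minimal Python sketch that counts identical and gapped columns in a pair of aligned strings and converts counts to percentages. The function names are my own, and the two strings are the last block of the alignment above; similarity is not computed, since that would need the substitution matrix.

```python
def alignment_counts(aligned_a, aligned_b, gap="-"):
    """Return (identical columns, gapped columns, alignment length)."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    identical = 0
    gapped = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            gapped += 1
        elif a == b:
            identical += 1
    return identical, gapped, len(aligned_a)


def percent(count, length):
    """Percentage rounded to one decimal place, as in the water report."""
    return round(100.0 * count / length, 1)


# Last block of the alignment shown above (PINK1 481-505 vs template 245-276).
pink1_block = "D-------VRQLVRALLQREASKRPSARVAAN"
template_block = "DWTQVSDEAKQLVKLMLTYEPSKRISAEEALN"
block_counts = alignment_counts(pink1_block, template_block)

# Whole-alignment identity from the report header: 82 identical columns of 282.
overall_identity = percent(82, 282)
```

Note that water divides by the alignment length including gap columns, so a long gapped alignment can report a lower identity than the ungapped regions alone would suggest.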

A better approach would be to use multiple templates as the input for Modeller, which is quite capable of operating in this fashion (have a look at the advanced example in the Modeller documentation). A BLASTP of PINK1 vs the PDB reveals a number of hits of similar quality in the same region of the protein; you could use several of these as templates to improve your results.
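A multi-template run might look roughly like the Python sketch below. It assumes MODELLER 10 or later (older releases use the lowercase names environ and automodel), an alignment file in PIR format that already contains the target aligned to every template, and template PDB files in the working directory. The alignment file name and the template and target codes in the usage note are hypothetical.

```python
def build_models(alnfile, templates, target, n_models=5):
    """Build n_models homology models from one or more templates."""
    # Imported here so this file can be loaded without MODELLER installed.
    from modeller import Environ
    from modeller.automodel import AutoModel, assess

    env = Environ()
    env.io.atom_files_directory = ["."]  # where the template PDB files live

    a = AutoModel(
        env,
        alnfile=alnfile,            # target + templates alignment (PIR format)
        knowns=tuple(templates),    # template codes as named in the alignment
        sequence=target,            # target code as named in the alignment
        assess_methods=(assess.DOPE, assess.GA341),
    )
    a.starting_model = 1
    a.ending_model = n_models
    a.make()
    return a.outputs


def best_by_dope(outputs):
    """Pick the successfully built model with the lowest DOPE score."""
    ok = [m for m in outputs if m["failure"] is None]
    return min(ok, key=lambda m: m["DOPE score"])
```

For example, best_by_dope(build_models("pink1-multi.ali", ["tmplA", "tmplB"], "PINK1")) would return the output record of the model with the lowest DOPE score, where tmplA and tmplB stand for whichever templates you include in the alignment.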

score 4 · Answer 2 · 2011-06-02

A few things need to be considered when selecting your templates, and the selection depends entirely on the aim of the project. If you are planning to model the full length of your target sequence, then sequence identity, sequence similarity and query coverage all matter a lot. On the other hand, if you only want to look at the catalytically or functionally important sites or regions, templates with lower sequence identity might also help a lot, because it has been shown that in most cases functionally important residues are conserved among related proteins.

Of course, you can also use multiple templates if they have similar folds or belong to the same family. Also, the first hit returned by a BLAST search might not always be the most appropriate template: such templates, although they have higher sequence identity, sometimes correspond to only part of the query sequence (the sequence you want to model). So look at the query coverage as well as the sequence identity.
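One way to apply this advice is to filter hits by query coverage before comparing identities. The Python sketch below does that; the 50% coverage cut-off and the identity times coverage ranking key are my own assumptions rather than an established rule, and the three hits are made-up placeholders.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    template: str
    identity: float  # percent identity reported by BLAST
    coverage: float  # percent of the query covered by the hit
    evalue: float


def rank_templates(hits, min_coverage=50.0):
    """Drop low-coverage hits, then rank by identity x coverage."""
    eligible = [h for h in hits if h.coverage >= min_coverage]
    # Higher identity x coverage first; ties broken by the lower E-value.
    return sorted(
        eligible,
        key=lambda h: (h.identity * h.coverage, -h.evalue),
        reverse=True,
    )


# Hypothetical hits, for illustration only.
hits = [
    Hit("tmplX", identity=45.0, coverage=30.0, evalue=1e-5),
    Hit("tmplY", identity=29.0, coverage=85.0, evalue=1e-15),
    Hit("tmplZ", identity=27.0, coverage=95.0, evalue=1e-17),
]
ranked = rank_templates(hits)
```

Here tmplX has the highest identity but is dropped because it covers under half of the query. If only a single domain or active site is of interest, a lower min_coverage may be more appropriate.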

In your case, you could use the first template if you do not want to use multiple templates. However, I would recommend going through the UniProt entry for this protein (http://www.uniprot.org/uniprot/q9bxm7) in detail before you proceed with modeling. UniProt provides a wealth of information about protein sequences, and you might get some of your answers from there. You can also find homology-based models for this protein that have already been deposited; if you are satisfied with them, you can use the models available there.

I hope this helps a bit. Tell me if anything I have written is confusing or unclear.

Ram · Answer 3 · 2011-06-07

I would first check the homology modeling resources for a quick overview of the available templates, and of the structure quality achievable with them, before starting the modeling steps.

For example, the following resources provide pre-computed homology models (or let you generate them on the fly) using available PDB templates.

ModBase

Swiss-Model Repository

Protein Model Portal: provides various templates; you can choose a template and proceed to automated modeling

Since you are interested in Modeller, I will go with the results in ModBase, which uses MODELLER as the modeling package. Here you need to look at the following parameters:

Sequence identity

Query vs. Target coverage

Quality scores (see the Model Information section of the ModBase page)

score 0 · Answer 4 · 2013-01-02

0


11.3 years ago

Dollas Salleh ▴ 70

Hi. I just want to ask: in my case I am planning to look at the catalytic/functionally important residues for docking purposes. After a BLAST search against the PDB, these are my candidate templates: 1) A (identity: 28%, query coverage: 87%, E-value: 2e-16) 2) B (identity: 26%, query coverage: 94%, E-value: 2e-18) 3) C (identity: 36%, query coverage: 36%, E-value: 1e-06). The templates all come from different organisms and carry different structural information. In your opinion, which template should I choose: A, B or C?

Thanks.

ADD COMMENT • link 11.3 years ago by Dollas Salleh ▴ 70

0


Please create a new question

ADD REPLY • link 11.3 years ago by Hranjeev ★ 1.5k