Hunting For A Protein Based On Its Domain Architecture
13.2 years ago
jvijai ★ 1.2k

Hi,
(Disclaimer: I am a novice in protein structure informatics)

I am trying to find proteins in any annotated protein database that have all of the following features:
1. One ankyrin repeat region,
2. One leucine-rich repeat (LRR) domain,
3. One kinase domain,
4. One DFG-like motif,
5. One RAS domain,
6. One GTPase domain,
7. One MLK-like domain,
8. One WD40 domain

Ankyrin-LRR-Kinase-DFGlike-RAS-GTPase-MLKlike-WD40

Where do I start?
What are the logical steps to zero in on possible candidates?

Thanks
:)

protein database • 2.6k views

Thanks everyone. Will try out these methods.

13.2 years ago

You can use the Architecture analysis functionality of SMART to pose this kind of question. However, it does not know of any proteins that match your long list of requirements.

Unrelated to that, requirements 5 and 6 would seem redundant, since the RAS domain is a GTPase domain.

13.2 years ago

I am not aware of any tool that does this in one step. Maybe one exists (CLCbio?) and other users can provide input on it.

Otherwise, I would say your best bet is to:

  1. Find models of the domains you are searching for. Depending on whether you would like to use RPS-BLAST or HMMER, these might be in the NCBI Conserved Domain Database (likely to be more complete) or, for example, in the Pfam HMM library.
  2. Find and set up a candidate sequence database to search with the models. This might mean creating a local BLAST database from SwissProt sequences (your computer is unlikely to be able to handle NCBI's nr database).
  3. Search your candidates with each domain model, using appropriate thresholds.
  4. A candidate that appears in every result set (setting aside how likely this is for your domains) contains all of the domains.

Note that some of these steps require knowing how to use the command-line BLAST or HMMER tools. Shell-level scripting will be a big advantage. This may or may not be a problem for you.
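
Step 4 is just a set intersection over the per-domain hit lists. Here is a minimal sketch in Python, assuming you ran one HMMER search per domain and saved each result with `--tblout` (in that format, comment lines start with `#` and the first whitespace-separated column is the target sequence name). The function names, domain labels, and protein IDs below are made up for illustration, and any score or E-value thresholds are assumed to have been applied during the search itself.

```python
from typing import Dict, Set


def hits_from_tblout(text: str) -> Set[str]:
    """Collect target sequence names from the text of a HMMER --tblout file."""
    hits = set()
    for line in text.splitlines():
        # Skip blank lines and the '#' comment/header lines.
        if not line.strip() or line.startswith("#"):
            continue
        # The first column of a tblout row is the target (sequence) name.
        hits.add(line.split()[0])
    return hits


def proteins_with_all_domains(results: Dict[str, str]) -> Set[str]:
    """Return the sequences that appear in every per-domain result."""
    hit_sets = [hits_from_tblout(text) for text in results.values()]
    if not hit_sets:
        return set()
    return set.intersection(*hit_sets)


# Made-up tblout snippets, one per domain model (not real search output).
example = {
    "Ank": "# target name  accession  query name\n"
           "protA - Ank - 1e-20 80.0 0.1\n"
           "protB - Ank - 1e-15 60.0 0.0\n",
    "LRR": "# target name  accession  query name\n"
           "protA - LRR - 1e-12 50.0 0.2\n"
           "protC - LRR - 1e-10 45.0 0.0\n",
    "Pkinase": "# target name  accession  query name\n"
               "protA - Pkinase - 1e-40 140.0 0.0\n"
               "protB - Pkinase - 1e-35 120.0 0.0\n",
}

print(sorted(proteins_with_all_domains(example)))
```

In practice you would read each file from disk and pass its contents in place of the inline strings. Note that this only checks that all domains are present, not the order in which they occur along the sequence.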

13.2 years ago

See the related question "How To Retrieve Human Proteins Sequence Containing A Given Domain".

I was quite happy with Marina's method there, which uses the advanced search feature at UniProt.

http://www.uniprot.org/uniprot/?query=domain%3AANK+AND+domain%3ALRR&sort=score

Feel free to add more domains.
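
If you want to add more domains without editing the URL by hand, the query string can be assembled programmatically. This is a minimal sketch that only reproduces the URL pattern shown above; the function name is hypothetical, the domain keywords must be whatever the UniProt `domain:` field actually accepts, and the address format is the one quoted in this answer, which may have changed on the UniProt side since.

```python
from typing import List
from urllib.parse import urlencode

# Base address copied from the URL quoted above (assumed, may be outdated).
BASE_URL = "http://www.uniprot.org/uniprot/"


def build_domain_query_url(domains: List[str]) -> str:
    """Build a search URL requiring every listed domain keyword."""
    # Join the per-domain terms with AND, e.g. "domain:ANK AND domain:LRR".
    query = " AND ".join(f"domain:{name}" for name in domains)
    # urlencode escapes ':' as %3A and spaces as '+'.
    return BASE_URL + "?" + urlencode({"query": query, "sort": "score"})


print(build_domain_query_url(["ANK", "LRR"]))
```

Passing a longer list, for example `["ANK", "LRR", "WD"]`, simply appends further `AND domain:...` terms to the query.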
