Find conserved motif
9.4 years ago
biolab ★ 1.4k

Dear all

I have the protein sequences encoded by one gene from a few organisms. Here is an example:

species1   AHLLAHLAHHLALJALHA
#                      ^----^
species2   AHLHAHLAHHHJJJAHHJ
species3   ALLLLLLAHLLHLJALHA
#                      ^----^

With ClustalX, species1 and species2 are grouped together because of their high overall similarity. However, I would like to find motifs that group species1 and species3 instead, for example the terminal LJALHA motif.

I would appreciate any suggestions. Thanks!

alignment • 2.0k views
9.4 years ago

There are very short sequences shared between different combinations of species (highlighted below). So you first need to write down a set of rules that define what counts as a motif, for example a minimum length and which species must or must not contain it. Based on those rules, people can suggest a tool or provide a short script.

species1   AHLLAHLAHHLALJALHA
#              ^---^   ^----^
species2   AHLHAHLAHHHJJJAHHJ
#              ^---^
species3   ALLLLLLAHLLHLJALHA
#                      ^----^

There are motif discovery programs (e.g., MEME) and Python libraries (e.g., Motility) that can calculate position weight matrices or motifs, but you need to provide more information about your problem.
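To make the "set of rules" idea concrete, here is a minimal sketch in plain Python. It assumes one possible rule that is not from the question itself: a motif is an exact substring of length k that occurs in every sequence of a chosen ingroup and in none of the outgroup sequences. The function names `kmers` and `discriminating_kmers` are made up for illustration, and exact matching is a simplification; real motifs usually tolerate substitutions, which is what tools like MEME model.

```python
def kmers(seq, k):
    """Return the set of all length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def discriminating_kmers(ingroup, outgroup, k):
    """Exact k-mers present in every ingroup sequence and in no outgroup sequence."""
    shared = set.intersection(*(kmers(s, k) for s in ingroup))
    for s in outgroup:
        shared -= kmers(s, k)
    return shared


# Toy sequences from the question.
seqs = {
    "species1": "AHLLAHLAHHLALJALHA",
    "species2": "AHLHAHLAHHHJJJAHHJ",
    "species3": "ALLLLLLAHLLHLJALHA",
}

# Rule assumed here: shared by species1 and species3, absent from species2.
hits = discriminating_kmers(
    [seqs["species1"], seqs["species3"]],
    [seqs["species2"]],
    k=6,
)
print(hits)
```

On the toy data, k=6 recovers the terminal LJALHA motif. For real proteins you would raise k to whatever minimum length your rules require.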


Hi Geeky, thanks. My analysis focuses on a single protein. With ClustalX, species1 and species2 are grouped together, but this grouping is wrong: it conflicts with the taxonomic classification. I need to find motifs (for example, longer than 7 amino acids) that group species1 with species3 and separate them from species2. That is my overall aim.

I will try MEME. If you have any ideas, please feel free to comment. THANKS.

9.4 years ago
Asaf 10k

You can supply ClustalX with the real tree, so that it should highlight the similarities between the sequences that are actually related.

I think that defining a motif for two sequences basically means finding a long stretch of amino acids that are identical (or very similar). You can do this by simply aligning them (using needle from EMBOSS, for instance) and locating stretches of 7 amino acids that are similar.
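The second step, locating the stretches, could be sketched as below. This assumes you have already extracted the two aligned rows from a pairwise alignment (for instance from needle output) as equal-length strings with `-` for gaps; parsing the alignment file is not shown. The function name `conserved_runs` is hypothetical, and the sketch counts only identical residues, not "very similar" ones, which would need a substitution matrix.

```python
def conserved_runs(aln_a, aln_b, min_len=7):
    """Find runs of identical, ungapped columns in a pairwise alignment.

    Returns (start, end, segment) tuples, where start/end are 0-based
    alignment column indices (end exclusive), not sequence positions.
    """
    if len(aln_a) != len(aln_b):
        raise ValueError("aligned sequences must have equal length")
    runs = []
    start = None
    for i, (a, b) in enumerate(zip(aln_a, aln_b)):
        match = a == b and a != "-"
        if match and start is None:
            start = i  # a new run of identical columns begins
        elif not match and start is not None:
            if i - start >= min_len:
                runs.append((start, i, aln_a[start:i]))
            start = None
    # Close a run that extends to the last column.
    if start is not None and len(aln_a) - start >= min_len:
        runs.append((start, len(aln_a), aln_a[start:]))
    return runs


# Toy example from the question: species1 vs species3 are the same length,
# so they are treated here as an alignment without gaps.
toy = conserved_runs("AHLLAHLAHHLALJALHA", "ALLLLLLAHLLHLJALHA", min_len=6)
print(toy)
```

The toy motif is only 6 residues long, so the example lowers `min_len` to 6; with the default of 7 nothing is reported for these two sequences.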


Thanks a lot, Asaf, your comments are really helpful!
