Question

Secondary Structure Alignment Method

2

10.4 years ago

virpatel3 ▴ 20

I have many genes from many bacteriophages, each with a precomputed secondary structure such as:

HHHHHHHHHHHHHCCCCCCHHHHHHHHHHHHHHHHHHHHHCCHHHHHHHHHCHHHHHHHH

I want to align each gene's secondary structure with the SSE (secondary structure elements) of the other bacteriophages. So for instance, I may also have

HHHHHHHHHHHHHCCCCCCHHHHHHHHHHHHHHHHHHHHHCCHHHHHHHHHCCHHHHHHH

for another gene. I would like to be able to align these genes so that I can find the evolutionary relationships between the bacteriophages.

Does anyone have a suggestion for the algorithms I should use for this project?

alignment secondary-structure • 3.8k views

ADD COMMENT • link updated 3.2 years ago by Ram 45k • written 10.4 years ago by virpatel3 ▴ 20

1

I've never faced the problem of aligning secondary structures, so I'm probably overlooking a lot of complications, but from the way the question is asked, the first and easiest thing that comes to mind is to align one set of genes against the other with BLAST.

ADD REPLY • link 10.4 years ago by dariober 15k

1

BLAST is good, but he is asking for an alignment of the secondary structures, and I don't think BLAST is suitable in this case. It is true that BLAST is a good online alignment tool, but the problem is that it computes alignments by searching a database for possibly related sequences.

In your case, virpatel3, I suggest writing your own alignment algorithm.

You can search Google for examples of this.

Admittedly, I have never heard of aligning SS elements before.

ADD REPLY • link 10.4 years ago by a.polo88 ▴ 120

0

You could use Biopython's pairwise2 alignment module with a custom scoring function (see the Biopython documentation).
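To illustrate what "a custom scoring function" could mean here, below is a minimal sketch of a global (Needleman-Wunsch style) alignment over the H/E/C alphabet. It is written in plain Python and is not Biopython's API; the function names `ss_score` and `global_align`, the score values, and the linear gap penalty are all assumptions chosen for illustration.

```python
def ss_score(a, b):
    """Hypothetical scoring function for two secondary structure states."""
    if a == b:
        return 2            # identical states (H/H, E/E, C/C)
    if {a, b} == {"H", "E"}:
        return -3           # helix vs. strand: assumed to be the worst mismatch
    return -1               # coil vs. helix, or coil vs. strand


def global_align(s, t, score=ss_score, gap=-2):
    """Globally align two SS strings; returns (score, aligned_s, aligned_t)."""
    n, m = len(s), len(t)
    # dp[i][j] holds the best score for aligning s[:i] with t[:j]
    dp = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        dp[i][0] = i * gap
    for j in range(1, m + 1):
        dp[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            dp[i][j] = max(
                dp[i - 1][j - 1] + score(s[i - 1], t[j - 1]),  # align s[i-1] with t[j-1]
                dp[i - 1][j] + gap,                            # gap in t
                dp[i][j - 1] + gap,                            # gap in s
            )
    # Trace back from the bottom-right corner to recover one optimal alignment
    out_s, out_t = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and dp[i][j] == dp[i - 1][j - 1] + score(s[i - 1], t[j - 1]):
            out_s.append(s[i - 1])
            out_t.append(t[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and dp[i][j] == dp[i - 1][j] + gap:
            out_s.append(s[i - 1])
            out_t.append("-")
            i -= 1
        else:
            out_s.append("-")
            out_t.append(t[j - 1])
            j -= 1
    return dp[n][m], "".join(reversed(out_s)), "".join(reversed(out_t))


if __name__ == "__main__":
    total, a, b = global_align("HCCCHHHC", "HCCCHHHHHHHC")
    print(total)
    print(a)
    print(b)
```

The scoring function is the only part that encodes anything about secondary structure, so swapping in a different scheme only means replacing `ss_score`. The pairwise scores could then be turned into a distance matrix for tree building.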

ADD REPLY • link 5.9 years ago by cschu181 ★ 2.8k

Ram · Answer 1 · 2015-02-22

OK, I tried it on the ClustalW website.

It seems to work

http://www.ebi.ac.uk/Tools/services/web/toolresult.ebi?jobId=clustalw2-I20150222-140308-0817-61234194-pg

Try it!

If you use it, remember that the format for each sequence is:

> name of the protein, or whatever you want
sequence

Example:

>name
HCCCHHHC
>name1
HCCCHHHHHHHC
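
With many genes, typing this format by hand gets tedious. The sketch below shows one way to generate it from a dictionary of names and secondary structure strings; the helper name `to_fasta` and the 60-character line width are assumptions, not anything required by the website.

```python
def to_fasta(records, width=60):
    """Format a {name: ss_string} mapping as FASTA text (hypothetical helper)."""
    lines = []
    for name, ss in records.items():
        lines.append(f">{name}")
        # Wrap long strings so that no sequence line exceeds `width` characters
        for start in range(0, len(ss), width):
            lines.append(ss[start:start + width])
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    structures = {"name": "HCCCHHHC", "name1": "HCCCHHHHHHHC"}
    print(to_fasta(structures), end="")
```

The resulting text can be pasted into the submission form or written to a file first.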

Results for your examples:

http://www.ebi.ac.uk/Tools/services/web/toolresult.ebi?jobId=clustalw2-I20150222-140838-0639-99240772-oy

Ram · Answer 2 · 2019-08-21

If I remember correctly, there was a paper in PNAS by George D. Rose in the late 90s that dealt with this. The goal was to find structurally similar sequences based on predicted secondary structure. As far as I remember, they employed a simple scoring system (binary match/mismatch).

Edit: Here's the paper: "Seeking an ancient enzyme in Methanococcus jannaschii using orf, a program based on predicted secondary structure comparisons"

https://www.pnas.org/content/95/6/2818