Question: Multiple From Pairwise, Is It Always Possible?
0
8.2 years ago by
Vishalkpp0 wrote:

why is it generally not possible to directly construct multiple sequence alignment from pairwise alignments

modified 8.2 years ago by Jake Mick50 • written 8.2 years ago by Vishalkpp0
2

A rather confusing question. (1) Why in the title do you ask "is it always possible", but then in the question ask "why is it generally not possible"? (2) One way to construct MSAs is, in fact, to combine pairwise alignments: read http://en.wikipedia.org/wiki/Multiple_sequence_alignment. Not sure what you mean by "directly construct". (3) This sounds like a homework question. If it isn't, please expand on your problem and what you've tried so far to solve it.

From an optimal multiple alignment, we can infer pairwise alignments between all pairs of sequences. I was asked in my exam that why is it difficult to infer a good multiple alignment from optimal pairwise alignments. This isn't a homework question. I would like to know the reason behind this.

3
8.2 years ago by
Gustavo530
Seattle
Gustavo530 wrote:

If the sequences are strictly collinear and sufficiently similar to each other, it is generally possible to combine pairwise alignments into multiple.

There are various situations in which this may not work.

In the global scenario, imagine two related (and alignable) sequences A and B, both being well alignable to a longer sequence C that has the form a...b where a and b are subsequences similar (or even identical) to A and B, and the dots denote extra stuff (or perhaps nothing) in-between. Depending on the relative similarity levels of A to a, A to B and B to b, one may get discordant results, which will somewhat arbitrarily depend on the specific protocol used for constructing the MSA from the pairs. Add to the mix several other sequences with partial pairwise matches and some internal repetition, and it can get very complex.

In the local scenario, the same situation arises when the aforementioned A, B and C are subsequences longer sequences, e.g. x1-A-y1, x2-B-y2 and x3-C-y3, where the x subsequences can be aligned to each other, and the y subsequences can be aligned to each other as well. The pairwise matches between A, B and C may lead to locally irreconcilable multiple alignment. Again, with several sequences this can get very hairy fast.

In the more generic scenario, the various sequences may display a combination of such local discrepancies, and larger-scale discrepancies (e.g. for multi-domain proteins).

Finally, for highly divergent sequences, the pairwise alignments may fail to be significant (so they may have significant error, or be essentially arbitrary) while the MSA may recover the true similarity between the sequences as a group. This is why for very divergent sequences, a pairwise alignment extracted from an MSA may look strange and arbitrary - and score less than the optimal pairwise alignment between the same two sequences.

thanks for the explanation

0
8.2 years ago by
Ketil4.0k
Germany
Ketil4.0k wrote:

I think it is always possible (I think CLUSTAL etc start from pairwise alignments), but the result might not be very good. I think an optimal multiple alignment requires O(n^k) operations (where n is sequence lengths, and k number of sequences). All pairwise alignments take only O(k*n^2), so it'd be very surprising if you could derive an optimal MSA from this - at least without an exponential number of operations.

0
8.2 years ago by
Jake Mick50
Columbia, Missouri
Jake Mick50 wrote:

This is a silly question. One of the first popular multiple sequence alignment software actually produced pairwise alignments in the initial step. (Viewing the problem as template construction) the search space for the MSA of a nontrivial MxN set of sequences is combinatorially vast. By constructucting pairwise alignments one derives many templates. The set intersect of these templates in sequence space should be near the location of the full sequence alignment template. This is then the seed template for further refinement, finding the local minimum from a objective function from the seed.