Question: How to score a multiple sequence alignment (MSA)?
4.7 years ago by
mithrandir40
India
mithrandir40 wrote:

I want to compare two MSAs of the same set of protein sequences and determine which is better (I do not have the 'true alignment'). One way is the sum-of-pairs (SOP) method. But I think sum of pairs is defined for a single column of an alignment (the average of the scores of all pairs of residues in that column). So how do we calculate the score for the whole alignment? Do we again average the scores obtained for each column?

Consider:

A-K

VVA

CVK

Suppose this is the alignment and I am using BLOSUM62 as the scoring matrix.

The SOP for column 1 will be sc(AV) + sc(VC) + sc(AC) = a1 (say).

Similarly, for column 2 it will be sc(-V) + sc(VV) + sc(-V) = a2,

and for column 3 it will be sc(KA) + sc(AK) + sc(KK) = a3.

Now, for the score of the MSA, do I take the average of a1, a2 and a3?

And then use this score as a metric to compare between two MSAs?
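For the three-sequence example above, here is a minimal Python sketch of a whole-alignment sum-of-pairs score. It follows the common convention of summing the per-column scores rather than averaging them; dividing by the number of columns is an optional normalization. The BLOSUM62 values are hard-coded only for the residues in the example. `GAP_PENALTY = -4`, the zero score for gap-gap pairs, and the function names are assumptions made for illustration, since BLOSUM62 itself does not define a residue-versus-gap score.

```python
from itertools import combinations

# BLOSUM62 entries for the residues in the toy example only (the matrix is symmetric).
BLOSUM62_SUBSET = {
    ("A", "A"): 4, ("A", "C"): 0, ("A", "K"): -1, ("A", "V"): 0,
    ("C", "C"): 9, ("C", "K"): -3, ("C", "V"): -1,
    ("K", "K"): 5, ("K", "V"): -2,
    ("V", "V"): 4,
}

GAP_PENALTY = -4  # assumption: score for a residue paired with a gap


def pair_score(a, b):
    """Score one pair of symbols taken from the same column."""
    if a == "-" and b == "-":
        return 0  # assumption: gap-gap pairs are ignored
    if a == "-" or b == "-":
        return GAP_PENALTY
    key = (a, b) if (a, b) in BLOSUM62_SUBSET else (b, a)
    return BLOSUM62_SUBSET[key]


def column_scores(alignment):
    """Return the sum-of-pairs score of every column (a1, a2, a3, ...)."""
    return [
        sum(pair_score(a, b) for a, b in combinations(column, 2))
        for column in zip(*alignment)
    ]


def sum_of_pairs(alignment):
    """Whole-alignment score: the sum of the per-column scores."""
    return sum(column_scores(alignment))


alignment = ["A-K", "VVA", "CVK"]
print(column_scores(alignment))  # [-1, -4, 3]
print(sum_of_pairs(alignment))   # -2
```

Two alignments of the same sequences can have different numbers of columns, so the total and the per-column average may rank them differently. Whichever is used, both alignments should be scored with the same matrix and the same gap treatment.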

msa protein alignment sequence • 3.9k views
modified 2.8 years ago by hmnipunad0 • written 4.7 years ago by mithrandir40

When you say you want to determine which alignment is better, I am not sure whether you mean that you already know a priori that one is better based on some criterion (not explained in your post).

But this question reminds me of qscore from Robert Edgar (author of the MUSCLE software): http://www.drive5.com/qscore/. It can compare two alignments. I used it a while back, and it is UNIX compatible. I don't think there are Mac or Windows versions, if that matters to you. I am not familiar with the alistat or MstatX tools suggested by trausch.
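To give a rough idea of what a reference-based comparison measures, here is a small Python sketch of a Q-style score: the fraction of residue pairs aligned in the reference that are also aligned in the test alignment. This is an illustration only and is not taken from the qscore source; the function names `aligned_pairs` and `q_like_score` and the toy alignments are hypothetical.

```python
from itertools import combinations


def aligned_pairs(alignment):
    """Return the set of residue pairs that share a column.

    Each residue is identified as (sequence index, position in the
    ungapped sequence), so the result does not depend on column numbers.
    """
    positions = [0] * len(alignment)
    pairs = set()
    for column in zip(*alignment):
        residues = []
        for seq_index, symbol in enumerate(column):
            if symbol != "-":
                residues.append((seq_index, positions[seq_index]))
                positions[seq_index] += 1
        pairs.update(combinations(residues, 2))
    return pairs


def q_like_score(test, reference):
    """Fraction of reference residue pairs that the test alignment reproduces."""
    reference_pairs = aligned_pairs(reference)
    if not reference_pairs:
        return 0.0
    shared = reference_pairs & aligned_pairs(test)
    return len(shared) / len(reference_pairs)


reference = ["A-K", "VVA", "CVK"]
test = ["AK-", "VVA", "CVK"]
print(q_like_score(test, reference))       # 5/7, about 0.714
print(q_like_score(reference, reference))  # 1.0
```

A score of this kind needs a trusted reference alignment, so it measures agreement with that reference rather than quality in an absolute sense.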

Hi Anand. I do not have a reference alignment (I have updated the question), hence I cannot use Edgar's qscore.

Thanks for the link. Is there any documentation or set of steps for running the qscore code? I am pretty much lost. Could you please point me in the right direction?

4.7 years ago by
trausch1.5k
Germany
trausch1.5k wrote:

You can try Sean Eddy's alistat tool in the SQUID package or MstatX.

3.3 years ago by
sanju.bankapur0 wrote:

To compare two alignments, I got the code (http://www.drive5.com/qscore/). Could someone help me run it? I am pretty much lost. Any documentation or steps would help.

Since the program you downloaded is source code, you will need to compile it. Then run the program with the -h or -help option to see what built-in help it provides.

Thank you for the quick response. Indeed, I tried compiling it on Ubuntu 16.01 and got some errors.

gcc main.cpp -o main

In file included from /usr/include/c++/6/ext/hash_map:60:0,
from qscore.h:21,
from main.cpp:1:
/usr/include/c++/6/backward/backward_warning.h:32:2: warning: #warning This file includes at least one deprecated or antiquated header which may be removed without further notice at a future date. Please use a non-deprecated interface with equivalent functionality instead. For a listing of replacement headers and interfaces, consult the file backward_warning.h. To disable this warning use -Wno-deprecated. [-Wcpp]
#warning \
^~~~~~~
/tmp/ccQfXVdc.o: In function `main':
main.cpp:(.text+0x16): undefined reference to `Usage()'
main.cpp:(.text+0x38): undefined reference to `ParseOptions(int, char**)'
main.cpp:(.text+0x44): undefined reference to `FlagOpt(char const*)'
main.cpp:(.text+0x7e): undefined reference to `ValueOpt(char const*)'
main.cpp:(.text+0x8d): undefined reference to `SAB()'
main.cpp:(.text+0x99): undefined reference to `QScore()'
collect2: error: ld returned 1 exit status

The GCC compiler is up to date. Am I missing anything here?

I finally got this code (http://www.drive5.com/qscore/) to work. Here are the steps (for Ubuntu):

1) Open a terminal.
2) Go to the directory that contains all the .cpp files.
3) At the command prompt, type 'make' to build all the .o files. (I got an error related to unsigned max int; adding "#include <limits.h>" to the header file resolved it.)
4) The code is now ready to run. './qscore' lists all the options. './qscore -test testfile -ref referencefile' compares the test alignment with the reference alignment (make sure testfile and referencefile exist in the same path).

Thanks all for the help.

I tried to install qscore by running the 'make' command, but I got the error below:

g++ -O3 -g -DNDEBUG -D_FILE_OFFSET_BITS=64 -Wall -W  -c comparepair.cpp
In file included from /usr/include/c++/4.9/ext/hash_map:60:0,
from qscore.h:21,
from comparepair.cpp:1:
/usr/include/c++/4.9/backward/backward_warning.h:32:2: warning: #warning This file includes at least one deprecated or antiquated header which may be removed without further notice at a future date. Please use a non-deprecated interface with equivalent functionality instead. For a listing of replacement headers and interfaces, consult the file backward_warning.h. To disable this warning use -Wno-deprecated. [-Wcpp]
#warning \
^
comparepair.cpp:5:28: error: ‘UINT_MAX’ was not declared in this scope
unsigned g_TestSeqIndexA = UINT_MAX;
^
comparepair.cpp:6:28: error: ‘UINT_MAX’ was not declared in this scope
unsigned g_TestSeqIndexB = UINT_MAX;
^
comparepair.cpp:7:27: error: ‘UINT_MAX’ was not declared in this scope
unsigned g_RefSeqIndexA = UINT_MAX;
^
comparepair.cpp:8:27: error: ‘UINT_MAX’ was not declared in this scope
unsigned g_RefSeqIndexB = UINT_MAX;
^
Makefile:4: recipe for target 'comparepair.o' failed
make: *** [comparepair.o] Error 1

Then I tried adding the limits #include to see whether it made any difference, but the error still appeared every time. Could anyone help me fix this?

Hi. I got the same error as you. I was able to fix it by adding #include <climits> to the qscore.h header file.

2.8 years ago by