How to score a multiple sequence alignment (MSA)?
8.5 years ago
mithrandir ▴ 40

I want to compare two MSAs of the same set of protein sequences and determine which is better (I do not have the 'true alignment'). One way is the sum-of-pairs (SOP) method. But I think sum of pairs is defined for a single column of an alignment (the average of the scores of all pairs of residues in that column). So how do we calculate the score for the whole alignment? Do we again average the per-column scores?

Consider:

A-K
VVA
CVK


Suppose this is the alignment and I am using BLOSUM62 as the scoring matrix.

The SOP for column 1 will be sc(A,V) + sc(V,C) + sc(A,C) = a1 (say)

Similarly, for column 2 it will be sc(-,V) + sc(V,V) + sc(-,V) = a2

and for column 3 it will be sc(K,A) + sc(A,K) + sc(K,K) = a3

Now, for the score of the MSA, do I take the average of a1, a2, a3?

And then use this score as a metric to compare between two MSAs?
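
For reference, the sum-of-pairs score of a whole alignment is conventionally the sum of the column scores (a1 + a2 + a3 here), not their average. Dividing by the number of columns would make the result depend on alignment length, which can differ between two MSAs of the same sequences. Below is a minimal Python sketch for the toy alignment above. The BLOSUM62 values are copied by hand for only the residues involved. The gap handling (residue-gap = -4, gap-gap = 0) is an assumption, since BLOSUM62 itself does not define gap scores. The function names are illustrative only.

```python
from itertools import combinations

# Only the BLOSUM62 entries needed for the toy alignment (the matrix is symmetric).
BLOSUM62_SUBSET = {
    ("A", "A"): 4, ("A", "C"): 0, ("A", "K"): -1, ("A", "V"): 0,
    ("C", "C"): 9, ("C", "K"): -3, ("C", "V"): -1,
    ("K", "K"): 5, ("K", "V"): -2,
    ("V", "V"): 4,
}

# Assumption: a flat penalty for residue-vs-gap pairs, and 0 for gap-vs-gap.
GAP_PENALTY = -4


def pair_score(a, b):
    """Score one pair of aligned symbols."""
    if a == "-" and b == "-":
        return 0
    if a == "-" or b == "-":
        return GAP_PENALTY
    key = (a, b) if (a, b) in BLOSUM62_SUBSET else (b, a)
    return BLOSUM62_SUBSET[key]


def column_scores(alignment):
    """Return the sum-of-pairs score of each column."""
    if len({len(row) for row in alignment}) != 1:
        raise ValueError("all rows of an alignment must have the same length")
    return [
        sum(pair_score(a, b) for a, b in combinations(column, 2))
        for column in zip(*alignment)
    ]


def sum_of_pairs(alignment):
    """Score the whole alignment as the sum of its column scores."""
    return sum(column_scores(alignment))


msa = ["A-K", "VVA", "CVK"]
print(column_scores(msa))  # [-1, -4, 3]
print(sum_of_pairs(msa))   # -2
```

When two MSAs of the same sequences are scored with the same matrix and gap scheme, the one with the higher total scores better under this criterion. The ranking can change with the gap penalty chosen.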

MSA alignment sequence protein • 6.3k views

When you say you want to determine which alignment is better, I am not sure whether you already know a priori that one is better by some criterion (not explained in your post).

But this question reminds me of qscore by Robert Edgar (author of the MUSCLE software): http://www.drive5.com/qscore/ It can compare two alignments. I used it a while back, and it is UNIX compatible. I don't think there are Mac or Windows versions, if that matters to you at all. I am not familiar with the alistat or MstatX tools suggested by trausch.


Hi Anand. I do not have a reference alignment (I have updated the question), so I cannot use Edgar's qscore.


Thanks for the link. Is there any documentation, or a list of steps, for running the qscore code? I am pretty much lost. Could you please point me in the right direction?

8.5 years ago
trausch ★ 1.9k

You can try Sean Eddy's alistat tool in the SQUID package or MstatX.

7.2 years ago

To compare two alignments, I got this code (http://www.drive5.com/qscore/). Could someone help me with how to run it? I am pretty much lost. Is there any documentation or a list of steps? Please help.


Since the program you downloaded is source code, you will need to compile it first. Then run the program with the -h or -help option to see what built-in help it provides.


Thank you for the quick response. I did try compiling it on Ubuntu 16.01 and got some errors.

gcc main.cpp -o main

In file included from /usr/include/c++/6/ext/hash_map:60:0,
from qscore.h:21,
from main.cpp:1:
/usr/include/c++/6/backward/backward_warning.h:32:2: warning: #warning This file includes at least one deprecated or antiquated header which may be removed without further notice at a future date. Please use a non-deprecated interface with equivalent functionality instead. For a listing of replacement headers and interfaces, consult the file backward_warning.h. To disable this warning use -Wno-deprecated. [-Wcpp]
#warning \
^~~~~~~
/tmp/ccQfXVdc.o: In function `main':
main.cpp:(.text+0x16): undefined reference to `Usage()'
main.cpp:(.text+0x38): undefined reference to `ParseOptions(int, char**)'
main.cpp:(.text+0x44): undefined reference to `FlagOpt(char const*)'
main.cpp:(.text+0x7e): undefined reference to `ValueOpt(char const*)'
main.cpp:(.text+0x8d): undefined reference to `SAB()'
main.cpp:(.text+0x99): undefined reference to `QScore()'
collect2: error: ld returned 1 exit status


The GCC compiler is up to date. Am I missing anything here?

I finally got this code (http://www.drive5.com/qscore/) to work. Here are the steps (for Ubuntu):

1) Open a terminal.
2) Go to the directory where all the .cpp files are.
3) At the command prompt, type 'make' to build the .o files. (I got an error related to the unsigned max int; adding "#include <limits.h>" to the header file resolved it.)
4) The code is now ready to run. './qscore' prints all the options. './qscore -test testfile -ref referencefile' compares the test alignment with the reference alignment (make sure testfile and referencefile are in the same path).

Thanks all for the help.
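
For anyone scoring many alignments, the qscore invocation from the steps above can be scripted. This is only a sketch. It assumes the compiled binary is at ./qscore and uses just the -test and -ref options quoted in this thread. The helper names are hypothetical, and no assumption is made about the format of qscore's output.

```python
import subprocess


def qscore_command(test_path, ref_path, binary="./qscore"):
    """Build the argument list for one qscore comparison."""
    return [binary, "-test", test_path, "-ref", ref_path]


def run_qscore(test_path, ref_path, binary="./qscore"):
    """Run qscore and return whatever it prints to standard output.

    Raises subprocess.CalledProcessError if qscore exits with an error.
    """
    result = subprocess.run(
        qscore_command(test_path, ref_path, binary),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


# Show the command that would be executed, without running it.
print(qscore_command("testfile", "referencefile"))
```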


I tried to install qscore by running the 'make' command, but I got the error below:

g++ -O3 -g -DNDEBUG -D_FILE_OFFSET_BITS=64 -Wall -W  -c comparepair.cpp
In file included from /usr/include/c++/4.9/ext/hash_map:60:0,
from qscore.h:21,
from comparepair.cpp:1:
/usr/include/c++/4.9/backward/backward_warning.h:32:2: warning: #warning This file includes at least one deprecated or antiquated header which may be removed without further notice at a future date. Please use a non-deprecated interface with equivalent functionality instead. For a listing of replacement headers and interfaces, consult the file backward_warning.h. To disable this warning use -Wno-deprecated. [-Wcpp]
#warning \
^
comparepair.cpp:5:28: error: ‘UINT_MAX’ was not declared in this scope
unsigned g_TestSeqIndexA = UINT_MAX;
^
comparepair.cpp:6:28: error: ‘UINT_MAX’ was not declared in this scope
unsigned g_TestSeqIndexB = UINT_MAX;
^
comparepair.cpp:7:27: error: ‘UINT_MAX’ was not declared in this scope
unsigned g_RefSeqIndexA = UINT_MAX;
^
comparepair.cpp:8:27: error: ‘UINT_MAX’ was not declared in this scope
unsigned g_RefSeqIndexB = UINT_MAX;
^
Makefile:4: recipe for target 'comparepair.o' failed
make: *** [comparepair.o] Error 1

Then I tried adding #include limit to see whether it made any difference, but the error still occurred every time. Could anyone help me fix this?


Hi. I got the same error as you. I was able to fix it by adding #include <climits> to the qscore.h header file.


I got the message 'Warning: reference alignment BB20001.msf has no aligned (upper-case) columns' when running qscore to compare the test and reference alignment files with './qscore -test BB20001.msf -ref BB20001_out.msf'. Could anyone help me solve this?