Question: How to score a multiple sequence alignment (MSA)?
mithrandir wrote:

I want to compare two MSAs of the same set of protein sequences and determine which is better (I do not have the 'true alignment'). One way is the sum-of-pairs method. But I think sum of pairs is defined for a single column of an alignment (the average of the scores of all pairs of residues in that column). So how do we calculate the score for the whole alignment? Do we again average the scores obtained for each column?

Consider:

A-K
VVA
CVK

Suppose this is the alignment and I am using BLOSUM62 as the scoring matrix.

The SOP for column 1 will be sc(AV) + sc(VC) + sc(AC) = a1 (say).

Similarly, for column 2 it will be sc(-V) + sc(VV) + sc(-V) = a2,

and for column 3 it will be sc(KA) + sc(AK) + sc(KK) = a3.

Now, for the score of the MSA, do I take the average of a1, a2 and a3?

And then use this score as a metric to compare between two MSAs?
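
For what it's worth, the sum-of-pairs (SP) score of a whole alignment is conventionally the sum of the column scores rather than their average; averaging divides by the alignment length, which can differ between two alignments of the same sequences. Below is a minimal Python sketch of that calculation for the 3x3 example. The names (`pair_score`, `column_scores`, `sp_score`) are made up for illustration, the matrix holds only the BLOSUM62 entries needed for A, C, K and V, and the residue-versus-gap cost of -4 (with gap-versus-gap scored 0) is purely an assumption, since BLOSUM62 itself says nothing about gaps.

```python
from itertools import combinations

# Subset of BLOSUM62 covering only the residues in the example (A, C, K, V).
BLOSUM62_SUBSET = {
    ("A", "A"): 4, ("A", "C"): 0, ("A", "K"): -1, ("A", "V"): 0,
    ("C", "C"): 9, ("C", "K"): -3, ("C", "V"): -1,
    ("K", "K"): 5, ("K", "V"): -2,
    ("V", "V"): 4,
}

GAP = "-"
GAP_PENALTY = -4  # assumption: flat residue-vs-gap cost, not part of BLOSUM62


def pair_score(x, y, matrix=BLOSUM62_SUBSET, gap_penalty=GAP_PENALTY):
    """Score one pair of aligned symbols."""
    if x == GAP and y == GAP:
        return 0  # assumption: two gaps facing each other cost nothing
    if x == GAP or y == GAP:
        return gap_penalty
    # The matrix is symmetric, so look the pair up in either order.
    key = (x, y) if (x, y) in matrix else (y, x)
    return matrix[key]


def column_scores(alignment):
    """Sum-of-pairs score of each column (a1, a2, a3 in the question)."""
    return [
        sum(pair_score(x, y) for x, y in combinations(column, 2))
        for column in zip(*alignment)
    ]


def sp_score(alignment):
    """SP score of the whole alignment: the sum over all columns."""
    return sum(column_scores(alignment))


msa = ["A-K", "VVA", "CVK"]
print(column_scores(msa))  # [-1, -4, 3]
print(sp_score(msa))       # -2
```

Scoring both candidate MSAs with the same matrix and the same gap treatment gives two numbers that can be compared directly. Bear in mind that the ranking can change with the gap penalty chosen, so a higher SP score is only a hint that one alignment is better.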

Please let me know of any papers or resources that talk about this. Thanks

 

modified 2 days ago by hmnipunad • written 23 months ago by mithrandir

When you say you want to determine which alignment is better, I am not sure whether you mean that you already know a priori which one is better based on some criterion (not explained in your post).
But this question reminds me of qscore from Robert Edgar (author of the MUSCLE software): http://www.drive5.com/qscore/. It can compare two alignments. I used it a while back, and it is UNIX compatible. I don't think there are Mac or Windows versions, if that matters to you at all. I am not familiar with the alistat or MstatX tools suggested by trausch.

written 23 months ago by Anand Rao

Hi Anand. I do not have a reference alignment (I have updated the question), hence I cannot use Edgar's qscore.

written 23 months ago by mithrandir

Thanks for the link. Is there any documentation or list of steps for running the qscore code? I am pretty much lost. Could you please point me in the right direction?

written 6 months ago by sanju.bankapur
trausch wrote:

You can try Sean Eddy's alistat tool in the SQUID package

http://selab.janelia.org/software.html

or MstatX

https://github.com/gcollet/MstatX

written 23 months ago by trausch
sanju.bankapur wrote:

To compare two alignments, I got the code (http://www.drive5.com/qscore/). Could someone help me run it? I am pretty much lost. Any documentation or steps would help.

written 6 months ago by sanju.bankapur

Since the program you downloaded is source code you will need to compile it. Then run the program with -h or -help option to see what in-line help it provides.

modified 6 months ago • written 6 months ago by genomax

Thank you for the quick response. I did try compiling it on Ubuntu 16.01 and got some errors.


gcc main.cpp -o main

In file included from /usr/include/c++/6/ext/hash_map:60:0,
                 from qscore.h:21,
                 from main.cpp:1:
/usr/include/c++/6/backward/backward_warning.h:32:2: warning: #warning This file includes at least one deprecated or antiquated header which may be removed without further notice at a future date. Please use a non-deprecated interface with equivalent functionality instead. For a listing of replacement headers and interfaces, consult the file backward_warning.h. To disable this warning use -Wno-deprecated. [-Wcpp]
 #warning \
  ^~~~~~~
/tmp/ccQfXVdc.o: In function `main':
main.cpp:(.text+0x16): undefined reference to `Usage()'
main.cpp:(.text+0x38): undefined reference to `ParseOptions(int, char**)'
main.cpp:(.text+0x44): undefined reference to `FlagOpt(char const*)'
main.cpp:(.text+0x7e): undefined reference to `ValueOpt(char const*)'
main.cpp:(.text+0x8d): undefined reference to `SAB()'
main.cpp:(.text+0x99): undefined reference to `QScore()'
collect2: error: ld returned 1 exit status

The gcc compiler is up to date. Am I missing anything here?


I finally got this code (http://www.drive5.com/qscore/) to work. Here are the steps (for Ubuntu):

1) Open a terminal.
2) Go to the directory where all the .cpp files are.
3) At the command prompt, type 'make' to generate all the respective .o files. (I got an error related to unsigned max int; adding "#include <limits.h>" to the header file resolved it.)
4) The code is now ready to run. './qscore' on its own lists all the options. './qscore -test testfile -ref referencefile' is the command to compare the test alignment with the reference alignment (make sure testfile and referencefile exist in the same path).
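
For anyone wondering what a test-versus-reference comparison of this kind measures: the Q score, as Edgar defines it for MUSCLE, is the number of residue pairs aligned identically in the test and reference alignments, divided by the number of aligned residue pairs in the reference. Below is a rough Python sketch of that definition. It is not taken from the qscore source; the function names are invented for illustration, alignments are assumed to be plain lists of equal-length gapped strings with sequences in the same order, and every column is treated as aligned (any upper-case/lower-case convention in the reference is ignored).

```python
from itertools import combinations


def aligned_pairs(alignment, gap_chars="-."):
    """Return the set of residue pairs that share a column.

    Each residue is identified as (sequence index, residue index),
    where the residue index counts non-gap characters from 0.
    """
    counters = [0] * len(alignment)  # next residue index per sequence
    pairs = set()
    for column in zip(*alignment):
        present = []
        for seq_idx, ch in enumerate(column):
            if ch not in gap_chars:
                present.append((seq_idx, counters[seq_idx]))
                counters[seq_idx] += 1
        # Every two residues in the same column form one aligned pair.
        pairs.update(combinations(present, 2))
    return pairs


def q_score(test, ref):
    """Fraction of the reference's aligned pairs reproduced by the test."""
    ref_pairs = aligned_pairs(ref)
    if not ref_pairs:
        raise ValueError("reference alignment has no aligned residue pairs")
    return len(aligned_pairs(test) & ref_pairs) / len(ref_pairs)


ref = ["A-K", "VVA", "CVK"]
test = ["AK-", "VVA", "CVK"]
print(q_score(test, ref))  # 5 of the 7 reference pairs are recovered
print(q_score(ref, ref))   # 1.0
```

Because the score is defined relative to a reference, it does not help when no trusted alignment is available, which is the situation in the original question.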

Thanks all for the help.

modified 6 months ago • written 6 months ago by sanju.bankapur

I tried to install qscore by running the 'make' command, but I got the error below:

g++ -O3 -g -DNDEBUG -D_FILE_OFFSET_BITS=64 -Wall -W -c comparepair.cpp
In file included from /usr/include/c++/4.9/ext/hash_map:60:0,
                 from qscore.h:21,
                 from comparepair.cpp:1:
/usr/include/c++/4.9/backward/backward_warning.h:32:2: warning: #warning This file includes at least one deprecated or antiquated header which may be removed without further notice at a future date. Please use a non-deprecated interface with equivalent functionality instead. For a listing of replacement headers and interfaces, consult the file backward_warning.h. To disable this warning use -Wno-deprecated. [-Wcpp]
 #warning \
  ^
comparepair.cpp:5:28: error: ‘UINT_MAX’ was not declared in this scope
 unsigned g_TestSeqIndexA = UINT_MAX;
                            ^
comparepair.cpp:6:28: error: ‘UINT_MAX’ was not declared in this scope
 unsigned g_TestSeqIndexB = UINT_MAX;
                            ^
comparepair.cpp:7:27: error: ‘UINT_MAX’ was not declared in this scope
 unsigned g_RefSeqIndexA = UINT_MAX;
                           ^
comparepair.cpp:8:27: error: ‘UINT_MAX’ was not declared in this scope
 unsigned g_RefSeqIndexB = UINT_MAX;
                           ^
Makefile:4: recipe for target 'comparepair.o' failed
make: *** [comparepair.o] Error 1

Then I tried adding #include limit to see whether it made any difference, but the error still came up every time. Could anyone help me fix this?

modified 2 days ago • written 2 days ago by hmnipunad
hmnipunad wrote:

I got the message 'Warning: reference alignment BB20001.msf has no aligned (upper-case) columns' when running './qscore -test BB20001.msf -ref BB20001_out.msf' to compare the test and reference alignment files. Could anyone help me solve this?

written 2 days ago by hmnipunad