Question: Multiple Sequence Alignment Score
Asked 8.1 years ago by Lee Katz (Atlanta, GA):

Does anyone have a script that computes the score of a multiple sequence alignment, ideally using BioPerl?


Look at this thread: http://biostar.stackexchange.com/questions/5083/similarity-score-of-multiple-sequence-alignment

Comment written 8.1 years ago by Alastair Kerr

I almost thought someone had already asked this, until I read the question itself; it looks like this hasn't come up yet. http://biostar.stackexchange.com/questions/1086/pariwise-local-and-global-alignment

Comment written 8.1 years ago by Lee Katz (modified 7.0 years ago by Istvan Albert)
Answer written 8.1 years ago by Lee Katz (Atlanta, GA):

Per Alastair's comment, I tried MstatX, at https://github.com/gcollet/MstatX.

The program is easy to use and offers several ways of calculating the score. However, it bothers me a little that it doesn't print a final score to the terminal; it writes a score for each column to a file, which I was then able to sum up. It also bothers me that it doesn't come packaged with a simple matrix for DNA, so out of the box it is only set up for protein. I quickly made up a DNA matrix on the spot, which may not be technically correct.

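# Unpack and build MstatX
tar zxvf gcollet-MstatX-31481c6.tar.gz
cd gcollet-MstatX-31481c6
make
# Score the alignment; data/dna.mat is the hand-made DNA matrix shown below
./mstatx -ma path/to/file -b -sp data/dna.mat
# Sum the per-column scores that MstatX wrote to the output file
perl -ne '$score += $_; END { print "$score\n" }' file.cons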
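
For anyone who prefers Python to the Perl one-liner, here is a minimal sketch of the same summing step. It assumes the score file holds exactly one numeric value per non-blank line, which is what the one-liner above implies but is not confirmed in this thread; the function name and the sample numbers are made up for illustration.

```python
def read_column_scores(text):
    """Parse per-column scores, assuming one numeric value per non-blank line."""
    return [float(line) for line in text.splitlines() if line.strip()]


# Hypothetical contents of a per-column score file (made-up numbers).
example_text = "2.0\n-1.0\n\n0.5\n"

scores = read_column_scores(example_text)
total = sum(scores)
print(total)
```

To run it on a real file, pass the file contents instead of `example_text`, e.g. `read_column_scores(open("file.cons").read())`. Unlike Perl, `float()` raises a `ValueError` on a line that is not purely numeric, so a file with extra fields per line would need its own parsing.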

The matrix file I made (probably not the most correct one I could have come up with):

H DNA matrix
D DNA matrix by Lee Katz
R LIT:1902106 PMID:1438297
A Henikoff, S. and Henikoff, J.G.
T Amino acid substitution matrices from protein blocks
J Proc. Natl. Acad. Sci. USA 89, 10915-10919 (1992)
* matrix in 1/3 Bit Units
M rows = ATCGN-, cols = ATCGN-
      2.
     -1.      2.
     -1.     -1.      2.
     -1.     -1.     -1.      2.
     -1.     -1.     -1.     -1.     -2.
     -2.     -2.     -2.     -2.     -2.     -2.
//
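
To make it concrete how a matrix like this can feed into a column score, here is a minimal Python sketch of a plain sum-of-pairs score: for each alignment column, add up the matrix value for every pair of sequences. This is a generic illustration, not MstatX's code, and it may not match what MstatX computes with the options used above. The values are copied from the lower triangle of the matrix file; the function names and the toy alignment are hypothetical.

```python
from itertools import combinations

SYMBOLS = "ATCGN-"
# Lower triangle copied from the matrix file above (row i, columns 0..i).
LOWER_TRIANGLE = [
    [2],
    [-1, 2],
    [-1, -1, 2],
    [-1, -1, -1, 2],
    [-1, -1, -1, -1, -2],
    [-2, -2, -2, -2, -2, -2],
]


def build_matrix(symbols, triangle):
    """Expand a lower-triangular matrix into a symmetric lookup table."""
    scores = {}
    for i, row in enumerate(triangle):
        for j, value in enumerate(row):
            scores[(symbols[i], symbols[j])] = value
            scores[(symbols[j], symbols[i])] = value
    return scores


def column_sp_scores(alignment, scores):
    """Return one sum-of-pairs score per column of an uppercase alignment."""
    if len({len(seq) for seq in alignment}) != 1:
        raise ValueError("all aligned sequences must have the same length")
    return [
        sum(scores[(a, b)] for a, b in combinations(column, 2))
        for column in zip(*alignment)
    ]


MATRIX = build_matrix(SYMBOLS, LOWER_TRIANGLE)

# Toy alignment, made up for illustration.
alignment = ["ATCG-", "ATCGA", "TTCGA"]
per_column = column_sp_scores(alignment, MATRIX)
total = sum(per_column)
print(per_column, total)
```

Characters outside `ATCGN-` (including lowercase letters) raise a `KeyError` here; whether to map them to `N` is a design choice left open.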

I also tried alistat from the squid package, but it does not give a score.

Comment written 8.1 years ago by Lee Katz

MstatX now gives a global score for an alignment: the sum of the column scores divided by the number of columns.
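
As a small worked illustration of that definition (sum of column scores divided by number of columns), here is a Python sketch. The function name and the example scores are hypothetical, and how MstatX itself treats special cases such as gap-only columns is not stated in this thread.

```python
def global_score(column_scores):
    """Average the per-column scores: sum of scores / number of columns."""
    if not column_scores:
        raise ValueError("alignment has no columns")
    return sum(column_scores) / len(column_scores)


# Made-up per-column scores for a five-column alignment.
example_scores = [0, 6, 6, 6, -2]
average = global_score(example_scores)
print(average)
```

Dividing by the number of columns makes scores of alignments with different lengths easier to compare than the raw sum.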

Comment written 8.1 years ago by Bilouweb

Answer written 7.0 years ago by hazratyaaseen (South Africa):

muscle 3.8 has an 'spscore' option that computes the SP (sum-of-pairs) objective score for a multiple sequence alignment, e.g. path/to/muscle -spscore file_name

For example, to extract just the score into a variable (pseudocode):

Compute the SP score with muscle (e.g. path/to/muscle -spscore file_name -log <log_file>)
Read the log file
For each line of the log file
    If the line contains the string 'SP=' (perl e.g. $line =~ /SP=/)
        Print the segment after the match (perl e.g. print $')
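
The pseudocode above could be sketched in Python roughly as follows. It only parses log text and does not run muscle. The exact layout of the log line is not shown in this thread, so the sketch simply assumes that 'SP=' is followed directly by a number; the function name and the sample log line are made up.

```python
import re

# Assumption: the score appears as "SP=<number>" somewhere on a log line.
SP_PATTERN = re.compile(r"SP=\s*([-+]?\d+(?:\.\d*)?(?:[eE][-+]?\d+)?)")


def extract_sp_score(log_text):
    """Return the first number following 'SP=' in the log, or None if absent."""
    for line in log_text.splitlines():
        match = SP_PATTERN.search(line)
        if match:
            return float(match.group(1))
    return None


# Hypothetical log contents, for illustration only.
sample_log = "some other line\nFile=aln.fasta;SP=123.5\n"
sp_score = extract_sp_score(sample_log)
print(sp_score)
```

For a real run, read the file given to -log and pass its contents to `extract_sp_score`.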

NB: You could even extract the matching line with Unix 'grep'.

Download muscle from: http://www.drive5.com/muscle/

"The Father. The Son. The Holy Spirit. And the South African National Bioinformatics Institute."

Answer modified 7.0 years ago by Istvan Albert