Question: Implementation Of Blosum62 In The Source Code Of Global Pairwise Alignment Of Proteins
5.7 years ago by
Anshu50 wrote:


I am trying to implement pairwise protein sequence alignment in VB.NET using the Needleman-Wunsch global alignment algorithm.

I am not clear on how to include the BLOSUM62 matrix in my source code to do the scoring, that is, to fill the two-dimensional alignment matrix.

I have googled this and found that most people suggest using a flat file that contains the standard BLOSUM62 matrix. Does that mean I need to read this flat file and fill my own BLOSUM62 matrix in code from it?

The other approach could be to use some mathematical formula in the program logic to construct the BLOSUM62 matrix, but I am not sure about this option.

Any ideas or insights are appreciated.

Also, is there any pseudocode available for global pairwise alignment of proteins? I tried to find the basic steps of the algorithm online but had no luck, so I am planning to follow the same steps I used for global pairwise alignment of nucleotides.


sequence alignment scoring • 3.7k views
5.7 years ago by
Istvan Albert ♦♦ 58k
University Park, USA
Istvan Albert ♦♦ 58k wrote:

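There is no mathematical formula for this. The BLOSUM62 scores are derived empirically from substitutions observed in aligned protein blocks, so they have to be stored as a table.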

What you need is a data structure from which you can retrieve the score for each substitution you observe. It could be as simple as a hash map. For example, in Python you could initialize it like so:

blosum = dict()
blosum['Ala'] = dict()
blosum['Ala']['Ala'] = 4
blosum['Ala']['Arg'] = -1
blosum['Ala']['Asn'] = -2 
# ... and so on for the remaining residue pairs

Of course you would not initialize it by hand. The information should be read from a file, so that you can load different scoring matrices. Later, during alignment, when you observe an Ala -> Arg substitution you could retrieve the value as blosum['Ala']['Arg'].

Use the corresponding data structure from your programming language to build the same construct.
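
One detail worth noting about the nested hash map: substitution scores are symmetric, so both blosum['Ala']['Arg'] and blosum['Arg']['Ala'] should resolve. A minimal sketch, assuming the scores arrive as (residue, residue, score) triples; `ENTRIES` and `build_symmetric` are made-up names for illustration, and only a handful of BLOSUM62 entries are listed:

```python
from collections import defaultdict

# A few BLOSUM62 entries as (residue, residue, score) triples.
# The full matrix would list every residue pair once.
ENTRIES = [
    ("Ala", "Ala", 4),
    ("Ala", "Arg", -1),
    ("Ala", "Asn", -2),
    ("Arg", "Arg", 5),
    ("Arg", "Asn", 0),
    ("Asn", "Asn", 6),
]


def build_symmetric(entries):
    """Return a nested dict where table[a][b] == table[b][a]."""
    table = defaultdict(dict)
    for a, b, score in entries:
        table[a][b] = score
        table[b][a] = score  # mirror the entry so lookup order does not matter
    return dict(table)


blosum = build_symmetric(ENTRIES)
print(blosum["Ala"]["Arg"])  # -1
print(blosum["Arg"]["Ala"])  # -1
```

The same idea carries over to a Dictionary of Dictionaries in VB.NET.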


Thanks, Istvan, for your suggestion. I will work along the same lines.

Reply written 5.7 years ago by Anshu50
5.7 years ago by
Saclay, France
Bilouweb1.0k wrote:

My answer comes late, but I have only just discovered this website.

Instead of hard-coding the BLOSUM matrix in a data structure, I think it is a better idea to write a function that reads the matrix from a text file.

That way, if you want to try a scoring matrix other than BLOSUM62, you just have to read another file.
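
Such a reader function could look roughly like the Python sketch below. It assumes the usual flat-file layout: comment lines starting with '#', a header row of one-letter residue codes, then one row per residue. The name `read_score_matrix` and the embedded four-residue excerpt are illustrative only; a real run would pass an open file instead.

```python
import io


def read_score_matrix(handle):
    """Parse a whitespace-separated substitution matrix into a nested dict."""
    columns = None
    matrix = {}
    for line in handle:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.split()
        if columns is None:
            columns = fields  # first data line is the header of column residues
            continue
        row, scores = fields[0], fields[1:]
        if len(scores) != len(columns):
            raise ValueError(
                f"row {row!r} has {len(scores)} scores, expected {len(columns)}"
            )
        matrix[row] = {col: int(s) for col, s in zip(columns, scores)}
    return matrix


# Excerpt of BLOSUM62 (first four residues only), used here in place of a file.
SAMPLE = """\
# Excerpt of BLOSUM62
   A  R  N  D
A  4 -1 -2 -2
R -1  5  0 -2
N -2  0  6  1
D -2 -2  1  6
"""

blosum62 = read_score_matrix(io.StringIO(SAMPLE))
print(blosum62["A"]["R"])  # -1
```

Switching to another matrix then only means opening a different file and passing the handle to the same function.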



The Needleman-Wunsch algorithm is a simple dynamic programming approach. Perhaps this page can help you with pseudo-code:
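
For what it is worth, the dynamic programming steps can be sketched in Python as follows. This is only a rough illustration under some assumptions: a linear gap penalty of -4 (an arbitrary choice, not something BLOSUM62 prescribes), a nested dict `TOY_SCORES` holding just the A/R/N/D part of BLOSUM62, and a traceback that returns a single optimal alignment. The steps are the same as for nucleotides; only the match/mismatch score comes from the substitution matrix.

```python
def needleman_wunsch(seq1, seq2, score, gap=-4):
    """Global alignment with a linear gap penalty.

    score[a][b] is the substitution score; gap is an assumed penalty.
    Returns (best_score, aligned_seq1, aligned_seq2).
    """
    n, m = len(seq1), len(seq2)
    # F[i][j] = best score for aligning seq1[:i] with seq2[:j]
    F = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        F[i][0] = i * gap
    for j in range(1, m + 1):
        F[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            F[i][j] = max(
                F[i - 1][j - 1] + score[seq1[i - 1]][seq2[j - 1]],  # substitution
                F[i - 1][j] + gap,  # gap in seq2
                F[i][j - 1] + gap,  # gap in seq1
            )
    # Trace back from the bottom-right cell to recover one optimal alignment.
    out1, out2 = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and F[i][j] == F[i - 1][j - 1] + score[seq1[i - 1]][seq2[j - 1]]:
            out1.append(seq1[i - 1])
            out2.append(seq2[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and F[i][j] == F[i - 1][j] + gap:
            out1.append(seq1[i - 1])
            out2.append("-")
            i -= 1
        else:
            out1.append("-")
            out2.append(seq2[j - 1])
            j -= 1
    return F[n][m], "".join(reversed(out1)), "".join(reversed(out2))


# BLOSUM62 scores for four residues only, enough for the toy example below.
TOY_SCORES = {
    "A": {"A": 4, "R": -1, "N": -2, "D": -2},
    "R": {"A": -1, "R": 5, "N": 0, "D": -2},
    "N": {"A": -2, "R": 0, "N": 6, "D": 1},
    "D": {"A": -2, "R": -2, "N": 1, "D": 6},
}

result = needleman_wunsch("ARND", "AND", TOY_SCORES)
print(result)  # (12, 'ARND', 'A-ND')
```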

5.7 years ago by
Pierre Lindenbaum77k wrote:

Sorry, I only speak Java here.

I would create an interface ScoreMatrix:

public interface ScoreMatrix {
    public int getScore(char aa1, char aa2);
}

that would be used by your AlignmentTool

public interface AlignmentTool {
    public void setScoreMatrix(ScoreMatrix m);
    public ScoreMatrix getScoreMatrix();
    public void align(String seq1, String seq2);
}

and Blosum62 would be an implementation of ScoreMatrix

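public class Blosum62 implements ScoreMatrix {
    public int getScore(char aa1, char aa2) {
        switch (aa1) {
            case 'A': switch (aa2) { case 'A': return 4; } // A/A is 4 in BLOSUM62; other pairs omitted
        }
        throw new IllegalArgumentException("No score for " + aa1 + "/" + aa2);
    }
}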