Implementation Of Blosum62 In The Source Code Of Global Pairwise Alignment Of Proteins
14.2 years ago
Anshu ▴ 50

Hi,

I am trying to implement pairwise protein sequence alignment with the Needleman-Wunsch global alignment algorithm, using VB.NET.

How should I include the BLOSUM62 matrix in my source code so that I can use it for scoring while filling the two-dimensional alignment matrix?

I have googled and found that most people suggest using a flat file that contains the standard BLOSUM62 matrix. Does that mean I need to read this flat file and fill my own in-code BLOSUM62 matrix from it?

The other approach could be to use some mathematical formula in the program logic to construct the BLOSUM62 matrix, but I am not sure about this option.

Any ideas or insights are appreciated.

Also, is there any pseudocode available for global pairwise alignment of proteins? I tried to find the basic steps of the algorithm online but had no luck, so I am planning to follow the same steps as I did for global pairwise alignment of nucleotides.

Thanks

alignment sequence • 11k views
14.2 years ago

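There is no mathematical formula for this: the BLOSUM62 scores were derived empirically from substitutions observed in aligned protein blocks, so they have to be looked up.

What you need is a data structure that you can use to retrieve the score for the substitutions that you observe. It could be as simple as a hash map. For example, in Python you could initialize it like so: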

blosum = dict()
blosum['Ala'] = dict()
blosum['Ala']['Ala'] = 4
blosum['Ala']['Arg'] = -1
blosum['Ala']['Asn'] = -2

... etc ...

Of course you would not initialize it by hand. The information should be read from a file; that way you can load different scoring matrices. Later, during alignment, when you observe an Ala -> Arg substitution you can retrieve the value as:

blosum['Ala']['Arg']

Use the corresponding data structure from your programming language to build the same construct.
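Reading the matrix from a file could look roughly like the sketch below. It assumes the whitespace-separated layout commonly used for scoring matrix files (comment lines starting with `#`, one header line of column letters, then one row per residue), and it uses one-letter codes as keys instead of the three-letter names above. The function name `read_matrix` and the four-residue `SAMPLE` excerpt are illustrative only; a real BLOSUM62 file has the same shape with more rows and columns.

```python
import io

# Four-residue excerpt (A, R, N, D) in the assumed flat-file layout.
SAMPLE = """\
# excerpt of BLOSUM62 (A, R, N, D only)
   A  R  N  D
A  4 -1 -2 -2
R -1  5  0 -2
N -2  0  6  1
D -2 -2  1  6
"""

def read_matrix(handle):
    """Parse a scoring matrix into a nested dict: scores[row][col] -> int."""
    scores = {}
    columns = None
    for line in handle:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.split()
        if columns is None:
            columns = fields  # the first data line is the column header
            continue
        row, values = fields[0], fields[1:]
        if len(values) != len(columns):
            raise ValueError(
                f"row {row!r} has {len(values)} scores, expected {len(columns)}"
            )
        scores[row] = {col: int(v) for col, v in zip(columns, values)}
    return scores

# io.StringIO stands in for an open file; with a real file you would pass
# the handle returned by open(path) instead.
blosum = read_matrix(io.StringIO(SAMPLE))
print(blosum["A"]["R"])
```

Swapping in another scoring matrix then only means pointing `read_matrix` at a different file.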


Thanks, Istvan, for your suggestion. I will work along the same lines.

14.1 years ago
Bilouweb ★ 1.1k

My answer comes late, but I just discovered this web site. Instead of hard-coding the BLOSUM matrix in a data structure, I think it is a better idea to write a function that reads the matrix from a text file. Then, if you want to try a scoring matrix other than BLOSUM62, you just have to read another file.

Bilou.

[Edit]

The Needleman-Wunsch algorithm is a simple dynamic programming approach. Perhaps this page can help you with pseudocode: http://en.wikipedia.org/wiki/Dynamic_programming
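For orientation, the dynamic programming steps can be sketched in Python as below. This is a minimal sketch, not a reference implementation: it assumes a linear gap penalty (the value -4 is an arbitrary choice for illustration), takes the substitution score as a function argument, and uses a hypothetical four-residue excerpt of BLOSUM62 so that it runs on its own. The names `needleman_wunsch` and `TOY_SCORES` are made up for this example.

```python
# Four-residue excerpt of BLOSUM62 (A, R, N, D), for illustration only.
TOY_SCORES = {
    "A": {"A": 4, "R": -1, "N": -2, "D": -2},
    "R": {"A": -1, "R": 5, "N": 0, "D": -2},
    "N": {"A": -2, "R": 0, "N": 6, "D": 1},
    "D": {"A": -2, "R": -2, "N": 1, "D": 6},
}

def needleman_wunsch(seq1, seq2, score, gap=-4):
    """Global alignment with a linear gap penalty (assumed, not prescribed)."""
    n, m = len(seq1), len(seq2)
    # F[i][j] holds the best score for aligning seq1[:i] with seq2[:j].
    F = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        F[i][0] = i * gap
    for j in range(1, m + 1):
        F[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            F[i][j] = max(
                F[i - 1][j - 1] + score(seq1[i - 1], seq2[j - 1]),  # substitution
                F[i - 1][j] + gap,  # gap in seq2
                F[i][j - 1] + gap,  # gap in seq1
            )
    # Trace back from the bottom-right corner to recover one optimal alignment.
    a1, a2 = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and F[i][j] == F[i - 1][j - 1] + score(seq1[i - 1], seq2[j - 1]):
            a1.append(seq1[i - 1]); a2.append(seq2[j - 1])
            i -= 1; j -= 1
        elif i > 0 and F[i][j] == F[i - 1][j] + gap:
            a1.append(seq1[i - 1]); a2.append("-")
            i -= 1
        else:
            a1.append("-"); a2.append(seq2[j - 1])
            j -= 1
    return F[n][m], "".join(reversed(a1)), "".join(reversed(a2))

best, top, bottom = needleman_wunsch("AND", "ARND", lambda a, b: TOY_SCORES[a][b])
print(best)
print(top)
print(bottom)
```

The only difference from the nucleotide version is the `score` lookup: a match/mismatch constant is replaced by a value taken from the substitution matrix.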

14.1 years ago

Sorry, I only speak Java here.

I would create an interface ScoreMatrix:

public interface ScoreMatrix
     {
     public int getScore(char aa1,char aa2);
     }

that would be used by your AlignmentTool

public interface AlignmentTool
     {
     public void setScoreMatrix(ScoreMatrix m);
     public ScoreMatrix getScoreMatrix();
     public void align(String seq1,String seq2);
     (...)
     }

and Blosum62 would be an implementation of ScoreMatrix

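public class Blosum62 implements ScoreMatrix
    {
    public int getScore(char aa1,char aa2)
        {
        switch(Character.toUpperCase(aa1))
          {
          // (...) one case per residue
          case 'A':
            switch(Character.toUpperCase(aa2))
              {
              // (...)
              case 'A': return 4; // BLOSUM62 score for A/A
              // (...)
              }
            break;
          // (...)
          }
        // reached only when a residue has no entry in the table
        throw new IllegalArgumentException("no score for " + aa1 + "/" + aa2);
        }
    }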
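
As a table-driven alternative to the nested switch, the same `getScore` idea can be sketched in Python with a 2D list indexed by residue position. The class name `TableScoreMatrix`, the method `get_score`, and the four-residue excerpt are hypothetical and only serve this illustration; a full implementation would carry all rows of the matrix.

```python
class TableScoreMatrix:
    """Score lookup backed by a square table, indexed by residue order."""

    def __init__(self, residues, rows):
        if any(len(row) != len(residues) for row in rows) or len(rows) != len(residues):
            raise ValueError("table must be square and match the residue list")
        # Map each residue letter to its row/column position.
        self._index = {aa: i for i, aa in enumerate(residues)}
        self._rows = rows

    def get_score(self, aa1, aa2):
        try:
            i = self._index[aa1.upper()]
            j = self._index[aa2.upper()]
        except KeyError as exc:
            raise ValueError(f"no score for {aa1}/{aa2}") from exc
        return self._rows[i][j]

# Four-residue excerpt of BLOSUM62 (A, R, N, D), for illustration only.
excerpt = TableScoreMatrix(
    "ARND",
    [
        [4, -1, -2, -2],
        [-1, 5, 0, -2],
        [-2, 0, 6, 1],
        [-2, -2, 1, 6],
    ],
)
print(excerpt.get_score("a", "R"))
```

An alignment tool that only calls `get_score` does not need to know whether the table was typed in or loaded from a file.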