traceback algorithm global alignment in R code
1
0
Entering edit mode
8.8 years ago
gizemtatar • 0

My needleman-wunsch global alignment scoring matrix:

        G    C    A   T    G   C    U
    0   -1  -2   -3  -4   -5  -6   -7
G  -1   1    0   -1  -2   -3  -4   -5
A  -2   0    0    1   0   -1  -2   -3
T  -3  -1   -1    0   2    1   0   -1
T  -4  -2   -2   -1   1    1   0   -1
A  -5  -3   -3   -1   0    0   0   -1
C  -6  -4   -2   -2  -1   -1   1    0
A  -7  -5   -3   -1  -2   -2   0    0

I want traceback algorithm and find best alignment sequence.

I try for this code.

Astring <- strsplit(A, "")[[1]]
Bstring <- strsplit(B, "")[[1]]

n=length(Astring) #seq1
m=length(Bstring) #seq2

s=matrix(data = 0, nrow = m, ncol =n, dimnames = list(Bstring,Astring))

for (i in seq_len(m)) {
    for (j in seq_len(n)) {
        if (Astring[j]==Bstring[i])
            s[i,j]<- 1
        else
            s[i,j]<- -1
    }
}

fmatrix = matrix(0, nrow = (nchar(A)+1) , ncol = nchar(B)+1)
d = -1  #this is gap penalty
for(i in 0 : nchar(B)) {
    fmatrix[i+1,1] = d * I  #populates initial row with gap penalty
}

for(j in 0 : nchar(A)) {
    fmatrix[1,j+1] = d * j
}

for(i in 1 : nchar(B)) {
    for(j in 1 : nchar(A)) {
        #get me sccore for the pair of aa or nt
        match  =  fmatrix[i,j] + s[i,j]
        delete = fmatrix[i,j+1] + d
        insert = fmatrix[i+1,j] + d
        fmatrix[i+1,j+1] = max(match,delete,insert)
    }
}

colnames(fmatrix) = strsplit( paste(" " , A, sep=""), "")[[1]];
rownames(fmatrix) = strsplit( paste(" " , B, sep=""), "")[[1]];
#return(fmatrix)

AlignmentA = ""
AlignmentB = ""

i = nchar(B) +1
j = nchar(A) +1

maxscore=numeric(length=0)

while(i > 1 && j > 1) {
    CurrentScore= fmatrix[i,j]
    ScoreDiag= fmatrix[i-1,j-1]
    ScoreLeft=fmatrix[i,j-1]
    ScoreUp=fmatrix[i-1,j]
    maxscore= max(CurrentScore,ScoreDiag,ScoreLeft,ScoreUp)

    if( CurrentScore <= ScoreDiag ) {
        AlignmentA=paste(substr(A,j-1,j-1),AlignmentA, sep="")
        AlignmentB=paste(substr(B,i-1,i-1),AlignmentB, sep="")
        j=j-1
        i=i-1
    } else if (ScoreLeft >= CurrentScore) {
        AlignmentA=paste(substr(A,j-1,j-1),AlignmentA, sep="")
        AlignmentB=paste("-",AlignmentB, sep="")
        j=j-1
    } else if (CurrentScore>=ScoreUp+d) {
        AlignmentA=paste("-",AlignmentA, sep="")
        AlignmentB=paste(substr(B,i-1,i-1),AlignmentB, sep="")            
        i=i-1
    }
    ans <- matrix(c(al1=paste(AlignmentA, collapse=""), al2=paste(AlignmentB, collapse="")))
    print(ans)
}

But I don't find best alignment. The best alignments :

GCATG-CU      GCA-TGCU      GCAT-GCU
G-ATTACA      G-ATTACA      G-ATTACA

Can you help this topic please?

R alignment sequence • 4.5k views
ADD COMMENT
1
Entering edit mode

Why reinvent the wheel when pairwiseAlignment() is a well tested function from Biostrings that does all of this for you anyway?

ADD REPLY
0
Entering edit mode

I need manual R code.

ADD REPLY
1
Entering edit mode

For an assignment?

ADD REPLY
0
Entering edit mode
4.5 years ago
parkkh1188 • 0
fmatrix = matrix(0,nrow = (nchar(A)+1), ncol = nchar(B)+1)

Switch to

fmatrix = matrix(0,nrow = (nchar(B)+1), ncol = nchar(A)+1)

Switch nchar(A) to nchar(B), nchar(B) to nchar(A)

ADD COMMENT

Login before adding your answer.

Traffic: 1517 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6