Entering edit mode
8.8 years ago
gizemtatar
•
0
My needleman-wunsch global alignment scoring matrix:
G C A T G C U
0 -1 -2 -3 -4 -5 -6 -7
G -1 1 0 -1 -2 -3 -4 -5
A -2 0 0 1 0 -1 -2 -3
T -3 -1 -1 0 2 1 0 -1
T -4 -2 -2 -1 1 1 0 -1
A -5 -3 -3 -1 0 0 0 -1
C -6 -4 -2 -2 -1 -1 1 0
A -7 -5 -3 -1 -2 -2 0 0
I want traceback algorithm and find best alignment sequence.
I try for this code.
Astring <- strsplit(A, "")[[1]]
Bstring <- strsplit(B, "")[[1]]
n=length(Astring) #seq1
m=length(Bstring) #seq2
s=matrix(data = 0, nrow = m, ncol =n, dimnames = list(Bstring,Astring))
for (i in seq_len(m)) {
for (j in seq_len(n)) {
if (Astring[j]==Bstring[i])
s[i,j]<- 1
else
s[i,j]<- -1
}
}
fmatrix = matrix(0, nrow = (nchar(A)+1) , ncol = nchar(B)+1)
d = -1 #this is gap penalty
for(i in 0 : nchar(B)) {
fmatrix[i+1,1] = d * I #populates initial row with gap penalty
}
for(j in 0 : nchar(A)) {
fmatrix[1,j+1] = d * j
}
for(i in 1 : nchar(B)) {
for(j in 1 : nchar(A)) {
#get me sccore for the pair of aa or nt
match = fmatrix[i,j] + s[i,j]
delete = fmatrix[i,j+1] + d
insert = fmatrix[i+1,j] + d
fmatrix[i+1,j+1] = max(match,delete,insert)
}
}
colnames(fmatrix) = strsplit( paste(" " , A, sep=""), "")[[1]];
rownames(fmatrix) = strsplit( paste(" " , B, sep=""), "")[[1]];
#return(fmatrix)
AlignmentA = ""
AlignmentB = ""
i = nchar(B) +1
j = nchar(A) +1
maxscore=numeric(length=0)
while(i > 1 && j > 1) {
CurrentScore= fmatrix[i,j]
ScoreDiag= fmatrix[i-1,j-1]
ScoreLeft=fmatrix[i,j-1]
ScoreUp=fmatrix[i-1,j]
maxscore= max(CurrentScore,ScoreDiag,ScoreLeft,ScoreUp)
if( CurrentScore <= ScoreDiag ) {
AlignmentA=paste(substr(A,j-1,j-1),AlignmentA, sep="")
AlignmentB=paste(substr(B,i-1,i-1),AlignmentB, sep="")
j=j-1
i=i-1
} else if (ScoreLeft >= CurrentScore) {
AlignmentA=paste(substr(A,j-1,j-1),AlignmentA, sep="")
AlignmentB=paste("-",AlignmentB, sep="")
j=j-1
} else if (CurrentScore>=ScoreUp+d) {
AlignmentA=paste("-",AlignmentA, sep="")
AlignmentB=paste(substr(B,i-1,i-1),AlignmentB, sep="")
i=i-1
}
ans <- matrix(c(al1=paste(AlignmentA, collapse=""), al2=paste(AlignmentB, collapse="")))
print(ans)
}
But I don't find best alignment. The best alignments :
GCATG-CU GCA-TGCU GCAT-GCU
G-ATTACA G-ATTACA G-ATTACA
Can you help this topic please?
Why reinvent the wheel when
pairwiseAlignment()
is a well tested function fromBiostrings
that does all of this for you anyway?I need manual R code.
For an assignment?