Question: Count how many gaps and mismatches using pairwise2 alignment
0
10 months ago by
bethpearson980 wrote:

I am doing a local alignment using pairwise2. I currently have some code that will count how many matches there are but I am trying to also count how many gaps and mismatches there are in the alignment too. Is there any way to do this? If possible I'd like the gaps and mismatches counted separately form each other and the position of the gaps/mismatches in the sequence would be super helpful.

sequence alignment • 578 views
modified 8 months ago by Biostar ♦♦ 20 • written 10 months ago by bethpearson980
1

Could you please share what you've tried and what is your code ?

I assume you took python module pairwise2

You can take a look at the `format_alignment` function and rewrite your own, something like this

``````def my_format_alignment(align1, align2, score, begin, end):
s = []
s.append("%s\n" % align1)
s.append(" " * begin)
gap=0
mismatch=0
for a, b in zip(align1[begin:end], align2[begin:end]):
if a == b:
s.append("|")  # match
elif a == "-" or b == "-":
s.append(" ")  # gap
gap+=1
else:
s.append(".")  # mismatch
mismatch+=1
s.append("\n")
s.append("%s\n" % align2)
s.append("  Score=%g\n" % score)
s.append("  Gap=%g\n" % gap)
s.append("  Mismatch=%g\n" % mismatch)
return ''.join(s)

from Bio import pairwise2
alignments = pairwise2.align.globalxx("ACCGT", "ACG")
from Bio.pairwise2 import format_alignment
print(format_alignment(*alignments[0]))
#ACCGT
#| ||
#A-CG-
#  Score=3
print(my_format_alignment(*alignments[0]))
#ACCGT
#| ||
#A-CG-
#  Score=3
#  Gap=2
#  Mismatch=0
``````