Hi everyone,
I am new to Python, so sorry if my question or code seems a bit weird or newbie.
I want to calculate the conservation between two sequences using a specific window size. I wrote the code below.
My question is: is there a shorter way of doing this using Biopython? I read their tutorial, but I couldn't find anything I could use here. I found ways to get the overall similarity score between two sequences, but I want to be able to check the similarity/conservation within a specific window size and eventually plot the data.
I appreciate any suggestions or help from you guys!
Thanks and have a nice day,
Sarah
from Bio import AlignIO
import matplotlib.pyplot as plt  # not used yet, intended for plotting later


# Compares 2 sequences in windows of a given size and prints the number of
# identical positions in each window
def compare(word1, word2, window):
    counts = []
    # Only compare up to the length of the shorter sequence
    min_length = min(len(word1), len(word2))
    for start in range(0, min_length, window):
        # The last window may be shorter than the others
        end = min(start + window, min_length)
        count = 0
        for i in range(start, end):
            if word1[i] == word2[i]:
                count = count + 1
        counts.append(count)
    print(counts)
    print(len(counts))
    # Start position of each window, to use as x values when plotting
    positions = list(range(0, min_length, window))
    print(positions)
    return positions, counts


def main():
    alignment = AlignIO.read("Alignment 2.aln", "clustal")
    word1 = alignment[0].seq
    word2 = alignment[2].seq  # index 2 is the third record in the alignment
    window = 20
    compare(word1, word2, window)


if __name__ == "__main__":
    main()
Can you clarify: are you looking for differences between direct substrings (e.g. a Hamming or Levenshtein distance), or are you looking to re-align the windows?
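If it is the first case (a position-by-position comparison of already aligned sequences, i.e. a Hamming-style count per window), here is a minimal sketch in plain Python. The function name `windowed_identity` is made up for illustration and is not a Biopython function. It assumes non-overlapping windows, works on plain strings (a `Seq` can be converted with `str()`), and treats gap characters like any other character, so two aligned gaps count as a match.

```python
def windowed_identity(seq1, seq2, window):
    """Return a list of (window_start, fraction_identical) tuples.

    Hypothetical helper: compares the two sequences position by position
    in non-overlapping windows, up to the length of the shorter sequence.
    """
    if window <= 0:
        raise ValueError("window must be a positive integer")
    n = min(len(seq1), len(seq2))
    results = []
    for start in range(0, n, window):
        end = min(start + window, n)  # the last window may be shorter
        matches = sum(
            1 for a, b in zip(seq1[start:end], seq2[start:end]) if a == b
        )
        results.append((start, matches / (end - start)))
    return results


if __name__ == "__main__":
    for start, identity in windowed_identity("ACGTACGTAC", "ACGAACGTTT", 4):
        print(start, identity)
```

The start positions and fractions can be unpacked into x and y values for a plot. Returning a fraction rather than a raw count keeps a shorter final window comparable with the others.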