I'm working on a project for a bioinformatics class. We are given several DNA strings and an integer k. The goal is to find a k-mer motif that minimises the sum, over all DNA strings, of the minimum Hamming distance between the motif and any k-mer in that string.
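To make the quantity being minimised concrete, here is a minimal sketch of the distance computation. The function names (`hamming_distance`, `min_distance`, `sum_of_min_distances`) are illustrative and are not the ones used in the code further down; the five DNA strings are the ones that appear in the console output later in the post.

```python
def hamming_distance(a, b):
    """Number of positions at which two equal-length strings differ."""
    if len(a) != len(b):
        raise ValueError("strings must have the same length")
    return sum(x != y for x, y in zip(a, b))


def min_distance(pattern, text):
    """Smallest Hamming distance between pattern and any k-mer of text."""
    k = len(pattern)
    return min(hamming_distance(pattern, text[i:i + k])
               for i in range(len(text) - k + 1))


def sum_of_min_distances(pattern, dna):
    """Total of min_distance over every string in dna."""
    return sum(min_distance(pattern, text) for text in dna)


# DNA strings copied from the console output in this post.
dna = ['AAATTGACGCAT', 'GACGACCACGTT', 'CGTCAGCGCCTG',
       'GCTGAGCACCGG', 'AGTACGGGACAG']
print(sum_of_min_distances('TTG', dna))  # prints 6
```

The per-string minima for TTG in this sketch are 0, 2, 1, 1 and 2, which add up to the 6 reported in the console output.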
First, here are the lines of code that Python appears to skip:
print('BBBBBBBBBBBBBB')
distance = 120000000000000000000000000000
print('distance=')
print(distance)
Median = []
pattern = []
sumHD = 0
The lines that were executed:
for i in range(0, 4**k - 1):  # 4 nucleotides give 4**k k-mers; this range stops at index 4**k - 2
    pattern = NumberToPattern(i, k)
    print('this is pattern')
    print(pattern)
    sumHD = sumOfMinHD(pattern, seqLines)
    print('sumHD=', type(sumHD))
    print(sumHD)
    print('distance=', type(distance))
    print(distance)
    if distance > sumHD:
        print('distance>sumOfMinHD')
        distance = sumOfMinHD(pattern, seqLines)
        Median = pattern
        print('distance=')
        print(distance)
    else:
        print('distance is <=sumOfMinHD')
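NumberToPattern is not shown in the post, so the following is only a sketch of what such a helper might look like. It assumes the common encoding A=0, C=1, G=2, T=3 and reads the index as a base-4 number; `number_to_pattern` and `NUCLEOTIDES` are hypothetical names.

```python
NUCLEOTIDES = 'ACGT'  # assumed ordering: A=0, C=1, G=2, T=3


def number_to_pattern(index, k):
    """Return the k-mer whose base-4 encoding is index (hypothetical helper)."""
    if not 0 <= index < 4 ** k:
        raise ValueError("index out of range for this k")
    letters = []
    for _ in range(k):
        # Peel off the least significant base-4 digit each time.
        index, digit = divmod(index, 4)
        letters.append(NUCLEOTIDES[digit])
    return ''.join(reversed(letters))


k = 3
all_patterns = [number_to_pattern(i, k) for i in range(4 ** k)]
print(all_patterns[0], all_patterns[62], all_patterns[-1])  # prints AAA TTG TTT
```

With four nucleotides there are 4**k possible k-mers, so a loop over `range(4**k)` visits each one exactly once. Under the assumed encoding, TTG is index 62 and TTT is index 63, so a loop whose last pattern is TTG has stopped one short of the final k-mer.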
The complete code (pseudocode credit: University of California San Diego's Discovering Hidden Messages in DNA (Bioinformatics I)):
print ('this is sum Of MinHD')
print (sumHD)
return sumHD
######################
def MedianString(seqLines, k):
    print('BBBBBBBBBBBBBB')
    # Start from a value larger than any possible sum of distances.
    distance = 1200000000000000000000000000
    print('distance=')
    print(distance)
    Median = []
    pattern = []
    sumHD = 0
    for i in range(0, 4**k - 1):  # 4 nucleotides give 4**k k-mers; this range stops at index 4**k - 2
        pattern = NumberToPattern(i, k)
        print('this is pattern')
        print(pattern)
        sumHD = sumOfMinHD(pattern, seqLines)
        print('sumHD=', type(sumHD))
        print(sumHD)
        print('distance=', type(distance))
        print(distance)
        if distance > sumHD:
            # New best pattern found: remember it and its score.
            print('distance>sumOfMinHD')
            distance = sumOfMinHD(pattern, seqLines)
            Median = pattern
            print('distance=')
            print(distance)
        else:
            print('distance is <=sumOfMinHD')
    return Median
####################
Median = MedianString(seqLines,k)
print ('this is MedianString')
print (Median)
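For comparison, here is a compact, self-contained sketch of the same brute-force search without the debug prints. It is not the course's reference implementation: it swaps the index-to-pattern helper for `itertools.product`, uses `float('inf')` as the starting distance, and all function names are illustrative.

```python
from itertools import product


def hamming_distance(a, b):
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def sum_of_min_distances(pattern, dna):
    """Sum over all strings of the best match of pattern within each string."""
    k = len(pattern)
    total = 0
    for text in dna:
        total += min(hamming_distance(pattern, text[i:i + k])
                     for i in range(len(text) - k + 1))
    return total


def median_string(dna, k):
    """Try every k-mer over ACGT and keep the first one with the lowest score."""
    best_pattern, best_distance = None, float('inf')
    for letters in product('ACGT', repeat=k):
        pattern = ''.join(letters)
        distance = sum_of_min_distances(pattern, dna)
        if distance < best_distance:
            best_pattern, best_distance = pattern, distance
    return best_pattern, best_distance


print(median_string(['AAC', 'AAC', 'AAG'], 3))  # prints ('AAC', 1)
```

Returning the distance together with the pattern makes it easy to compare the result against a hand-computed score for a single candidate.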
Finally, the console output from the loop's last iteration in MedianString(seqLines,k):
this is pattern
TTG
['AAATTGACGCAT', 'GACGACCACGTT', 'CGTCAGCGCCTG', 'GCTGAGCACCGG', 'AGTACGGGACAG']
this is minHD and Motif
[0, 'TTG']
this is minHD and Motif
[2, 'ACG']
this is minHD and Motif
[1, 'CTG']
this is minHD and Motif
[1, 'CTG']
this is sum Of MinHD
6
sumHD= <class 'int'>
6
distance= <class 'int'>
4
distance is <=sumOfMinHD
this is MedianString
ACC
For the task, I wrote the function MedianString(seqLines,k). It first assigns initial values to a few variables and then enters a for loop. When I run the function, Python appears to skip the lines before the loop and go straight to the loop body, as shown in this example. I searched online for a possible cause and found some similar discussions, but none of them seemed to apply to my situation. I'm completely lost...
Are you running this in a notebook or as a standalone script? Something seems to be missing: there is a return statement in the three lines just above your function definition. Also, basic Python questions are best asked elsewhere (e.g. Stack Exchange).
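One way to check whether the set-up lines really run, without relying on console scroll-back, is to record events in a list and inspect it afterwards. The sketch below is a generic illustration with hypothetical names (`traced_search`, `events`), not a rewrite of MedianString. It rests on an unconfirmed guess: that in a long run the early prints may simply have scrolled out of view.

```python
def traced_search(candidates, score):
    """Return (best candidate, list of trace events in execution order)."""
    events = ['setup']  # recorded once per call, before the loop starts
    best, best_score = None, float('inf')
    for candidate in candidates:
        events.append('loop:' + candidate)
        current = score(candidate)
        if current < best_score:
            best, best_score = candidate, current
    return best, events


def distance_to_aac(pattern):
    """Toy score: number of mismatches against the fixed string 'AAC'."""
    return sum(x != y for x, y in zip(pattern, 'AAC'))


best, events = traced_search(['AAA', 'AAC', 'AAG'], distance_to_aac)
print(best)    # prints AAC
print(events)  # prints ['setup', 'loop:AAA', 'loop:AAC', 'loop:AAG']
```

If the first entry of the trace is the set-up marker, the statements before the loop did execute, and the missing console lines have another cause.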