Longest common substring between two sequences with minimum error correction
0
0
Entering edit mode
2.8 years ago
Alex • 0

I have two strings A and B. I want to find the longest common substring between these two strings while allowing minimum error correction in B or A. The errors are in the form of adjacent swaps, for example s_1s_2...s_i s_{i+1}...s_n is one swap away from the string s_1s_2...s_{i+1} s_{i}...s_n. Is there any dynamic programming solution for it?

The score for matching a letter in two strings is M and the penalty for swapping is P. So in total the objective is maximizing the total M - P. Obviously If I swap two adjacent letters in A and there is a matched in B for these letters, then the score will increase by 2M - P.

I have written a recursive code in Python for calculating the LCS but I don't know how to add the swap correction.

M = 1 # score for matching a letter in two strings
P = 1 # penalty for swapping

def lcs(A, B, i, j, res):
     if i == -1 or j == -1:
         return res
     if A[i] == B[j]:
         res = lcs(A, B, i - 1, j - 1, res + M)
     return max(res, max(lcs(A, B, i, j - 1, 0), lcs(A, B, i - 1, j, 0)))


>>> A = "fghijklmnopq"
>>> B = "klmnopqrstuv"
>>> res = lcs(A, B, len(A) - 1, len(B) - 1, 0)
>>> print(res)
7
algorithm • 402 views
ADD COMMENT

Login before adding your answer.

Traffic: 1999 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6