KMP algorithm's time complexity
2
0
Entering edit mode
10 weeks ago

I'm attempting to implement strstr using the KMP method. This is the algorithm as described in Wikipedia. The temporal complexity of the KMP algorithm is O(n), where n is the length of the bigger string.

vector<int> KMP(string S, string K)
{
vector<int> T(K.size() + 1, -1);
vector<int> matches;

if(K.size() == 0)
{
matches.push_back(0);
return matches;
}
for(int i = 1; i <= K.size(); i++)
{
int pos = T[i - 1];
while(pos != -1 && K[pos] != K[i - 1]) pos = T[pos];
T[i] = pos + 1;
}

int sp = 0;
int kp = 0;
while(sp < S.size())
{
while(kp != -1 && (kp == K.size() || K[kp] != S[sp])) kp = T[kp];
kp++;
sp++;
if(kp == K.size()) matches.push_back(sp - K.size());
}

return matches;
}


I'm not sure how the complexity of this algorithm is O. (n). Can somebody explain how this code's complexity is O(n)?

algorithm • 611 views
0
Entering edit mode

How is this related to bioinformatics?

0
Entering edit mode

Follow up question: Why is this post almost a word-for-word copy of https://forum.wearedevs.net/t/29958?

1
Entering edit mode
10 weeks ago
lennykovac ▴ 90

The KMP algorithm searches a pattern P of length m in a query Q of length n, then the algorithm compares worst-case m * (n - m +1) elements.

Big O notation characterizes functions according to their growth rates: different functions with the same growth rate may be represented using the same O notation. The letter O is used because the growth rate of a function is also referred to as the order of the function. WIKI

EDIT: As Rob pointed out this would result in a complexity of O(n*m)! And this is indeed not linear. But there are implementations that scale it down to O(n+m) which is in O(n).

Here you can find the most common time-complexities

4
Entering edit mode

This isn't the case. That is, m (n - m + 1) is not in O(n). In fact, KMP is not actually O(n), but O(m + n). It takes time linear in the size of the pattern (O(m)) to build the partial match table, and then O(n) time given the query and partial match table to enumerate the occurrences. Performing m (n-m+1) comparisons would make the algorithm O(m*n), which is, in fact, the asymptotic complexity of "naive" substring search.

0
Entering edit mode
6 weeks ago
umang • 0

Not sure if this is the right forum to ask about KMP. Anyways you can read https://www.w3spot.com/2020/07/kmp-algorithm-explained-in-plain-english.html article. It explains time complexity of KMP algorithm in detail.