Exact pattern matching: why is it more specific to look for a non-consecutive subsequence?
5 days ago
Bethan • 0

I'm doing an online Bioinformatics course and the following example was given but not really explained.

If you're searching for an exact pattern P in text T, e.g.:

T = CGTGCGTGCTT...(etc.)
P = GCGTACT

It is apparently more specific to look for a non-consecutive subsequence of P in T than for a consecutive substring of P in T.

In other words, searching for this exact subsequence of P within T...

P = GC_T_C_

is supposed to give more specific results than searching for this one:

P = GCGT___

The question is: is that really true, and if so, why? Either way, the program checks 4 bases, and I'm assuming each base has a roughly equal chance of being A, C, G or T (ignoring GC skew). Shouldn't searching for either pattern in T therefore be equally specific?
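One way to check this intuition is to count hits for both masks directly. The snippet below is only a rough sketch, not anything from the course: `count_masked_hits` is a hypothetical helper name, `_` is treated as "don't care", and the text is randomly generated with independent, uniformly distributed bases (the same assumption as in the question). Under that assumption each mask checks 4 positions, so the chance of a hit at any one offset is (1/4)^4 = 1/256 for both. A real genome does not have independent neighbouring bases, so this says nothing about what happens on real sequence.

```python
import random


def count_masked_hits(text, pattern, blank="_"):
    """Count offsets in text where every non-blank pattern character matches."""
    # Keep only the positions of the pattern that actually have to match.
    checked = [(i, c) for i, c in enumerate(pattern) if c != blank]
    hits = 0
    for start in range(len(text) - len(pattern) + 1):
        if all(text[start + i] == c for i, c in checked):
            hits += 1
    return hits


# Assumption: a random text with independent, equally likely bases.
rng = random.Random(0)  # fixed seed, chosen arbitrarily for repeatability
text = "".join(rng.choice("ACGT") for _ in range(200_000))

spaced = count_masked_hits(text, "GC_T_C_")      # non-consecutive subsequence
contiguous = count_masked_hits(text, "GCGT___")  # consecutive substring

# Expected number of hits per mask if all bases are independent and uniform.
expected = (len(text) - len("GC_T_C_") + 1) / 4 ** 4

print("spaced hits:    ", spaced)
print("contiguous hits:", contiguous)
print("expected (iid): ", round(expected, 1))
```

Swapping the random `text` for a real sequence (for example, one read from a FASTA file) would show whether the two counts actually differ in practice.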

matching exact indexing patterns algorithms • 74 views