Exact pattern matching: why is it more specific to look for a non-consecutive subsequence?
5 days ago
Bethan • 0

I'm doing an online bioinformatics course, and the following example was given but not really explained.

If you're searching for an exact pattern P in text T, e.g.:

T = CGTGCGTGCTT...(etc.)

It is apparently more specific to look for a non-consecutive subsequence of P in T than for a consecutive substring of P in T.

In other words, searching for this exact subsequence of P within T...

P = GC_T_C_

is supposed to give more specific results than searching for this one:

P = GCGT___

The question is: is that really true and, if so, why? Either way, the program is looking for 4 bases, and I'm assuming each of those bases has a roughly equal chance of being A, C, G or T (ignoring GC skew). Shouldn't searching for either pattern in T therefore be equally specific?
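To make that reasoning concrete, here is a minimal Python sketch. It rests on several assumptions that are not from the course: `_` is treated as a "don't care" position, the whole 7-character window has to fit inside T, the function names `find_gapped` and `match_probability` are made up for illustration, and only the visible prefix of T is used because the rest is elided. It only restates the independent, uniform-base argument and does not settle whether real sequence behaves that way.

```python
def find_gapped(text, pattern, wildcard="_"):
    """Return 0-based offsets where every non-wildcard pattern character
    equals the text character at the same relative position."""
    # Only the specified (non-wildcard) positions are compared.
    checked = [(j, c) for j, c in enumerate(pattern) if c != wildcard]
    hits = []
    # Assumption: the full pattern window, wildcards included, must fit in text.
    for i in range(len(text) - len(pattern) + 1):
        if all(text[i + j] == c for j, c in checked):
            hits.append(i)
    return hits


def match_probability(pattern, wildcard="_", alphabet_size=4):
    """Chance that one fixed offset matches, assuming independent bases
    that are each uniform over the alphabet (the assumption in the question)."""
    specified = sum(1 for c in pattern if c != wildcard)
    return (1 / alphabet_size) ** specified


T_PREFIX = "CGTGCGTGCTT"  # only the part of T shown above; the rest is unknown

for p in ("GC_T_C_", "GCGT___"):
    print(p, find_gapped(T_PREFIX, p), match_probability(p))
```

Under the independence assumption both patterns get the same per-offset probability, (1/4)^4 = 1/256, because only the number of specified bases enters the calculation and not their spacing. So if the claim from the course holds, it would have to come from something this simple model leaves out.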

matching exact indexing patterns algorithms • 74 views

