I am trying to search for a pattern in a sequence such that a specific nucleotide does not immediately flank the match on either side.
For example, given the following sequence:
library(Biostrings)
x <- DNAString("TGCTTGCGCA")
I want to extract all occurrences of GC that have no T immediately before or after them.
Only one occurrence fits. In context, the three occurrences of GC are TGCT, TGCG and finally CGCA, and only CGCA meets the condition.
In other words, the matching pattern is:
But I can't find any way to implement it using the Biostrings package.
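To make the desired behaviour concrete, here is a minimal base R sketch that works on a plain character string rather than on the DNAString itself. It assumes that a Perl-style regular expression with lookarounds is an acceptable way to state the condition, and that a GC at the very start or end of the sequence (with no neighbour on one side) should still count. Neither assumption comes from Biostrings, and the variable names are illustrative only.

```r
# Same letters as x above; as.character(x) would give this string.
s <- "TGCTTGCGCA"

# "GC" not preceded by T and not followed by T.
# The lookarounds check the neighbours without consuming them.
m <- gregexpr("(?<!T)GC(?!T)", s, perl = TRUE)[[1]]

# gregexpr() returns -1 when nothing matches, so drop non-positive values.
starts <- as.integer(m)
starts <- starts[starts > 0]

# Each match with one flanking letter on either side, for inspection.
context <- substring(s, starts - 1, starts + 2)

starts   # 8
context  # "CGCA"
```

Within Biostrings, the same filter could presumably be applied by finding every GC with matchPattern("GC", x) and then discarding the hits whose neighbouring letters are T, but a more direct way would be preferable.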
I really hope you can help me figure it out.
Thanks for your help.