Convert regex into DNAString object
2.2 years ago
elisheva ▴ 110

Hi,
I am trying to search for a pattern in a sequence such that a specific nucleotide does not appear immediately next to the match on either side.
For example, given the following sequence:

x <- DNAString("TGCTTGCGCA")


I want to extract all occurrences of GC that have no T immediately before or after them.
Only one occurrence fits: the sequence contains TGCT, TGCG, and finally CGCA, and only the last one meets the condition.
In other words, the matching pattern is {T}GC{T}, where {T} stands for any nucleotide except T.
But I can't find any way to implement it using the Biostrings package.

I really hope you can help me figure it out.

R bioconductor Biostrings • 567 views

What is the problem with just converting the DNAString to a character and doing your regex with that?
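
A minimal sketch of that suggestion in base R, assuming the sequence has already been converted with as.character() (the string is written out directly here so the snippet runs without Biostrings). Negative lookbehind and lookahead express "no T before or after". This is one reading of the requirement: it also accepts a GC at the very start or end of the sequence, where there is no neighbour at all.

```r
# The example sequence as a plain string; as.character(x) on the DNAString
# from the question is assumed to give this same string.
s <- "TGCTTGCGCA"

# (?<!T) : not preceded by T;  (?!T) : not followed by T.
# Lookarounds need the PCRE engine, hence perl = TRUE.
g <- gregexpr("(?<!T)GC(?!T)", s, perl = TRUE)

gc_starts <- as.integer(g[[1]])    # start positions of accepted GC hits (-1 if none)
gc_hits   <- regmatches(s, g)[[1]] # the matched substrings themselves

print(gc_starts)
print(gc_hits)
```

For the example sequence this reports the single GC starting at position 8, the one inside CGCA.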


Because I use a StringSet and I want the analysis to be as fast as possible. If I convert every single interval to character, I guess it will be much slower.
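
One way to stay inside Biostrings, sketched here rather than taken from the thread: the IUPAC code V stands for A, C or G, that is, any base except T, so the condition can be written as the pattern VGCV and matched with fixed = FALSE, which makes matchPattern() and vmatchPattern() interpret ambiguity codes instead of treating them as literal letters. The three-sequence set below is made up for illustration. Unlike a lookaround regex, this pattern needs a real base on both sides, so a GC at the very edge of a sequence is not reported, and with fixed = FALSE any ambiguity codes in the subject (such as N) are interpreted as well.

```r
library(Biostrings)

# Single sequence from the question.
x <- DNAString("TGCTTGCGCA")
single <- matchPattern("VGCV", x, fixed = FALSE)  # V = A, C or G (not T)
gc_start_single <- start(single) + 1L             # GC sits one base inside each hit

# Hypothetical DNAStringSet, matched in one vectorised call
# without converting anything to character.
seqs   <- DNAStringSet(c(a = "TGCTTGCGCA", b = "AGCA", c = "TGCT"))
counts <- vcountPattern("VGCV", seqs, fixed = FALSE)  # hits per sequence
hits   <- vmatchPattern("VGCV", seqs, fixed = FALSE)  # MIndex of hit ranges
gc_start_a <- start(hits[[1]]) + 1L                   # GC starts in the first sequence

print(gc_start_single)
print(counts)
print(gc_start_a)
```

The hit ranges cover the flanking bases too, which is why 1 is added to the start to get the position of the GC itself.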
