Question: Convert regex into DNAString object
Written 12 months ago by elisheva (100):

I am trying to search a sequence for a pattern such that a specific nucleotide does not flank the match on either side.
For example, given the following sequence:

x <- DNAString("TGCTTGCGCA")

I want to extract all occurrences of GC that have no T immediately before or after.
Only one occurrence qualifies: the three GC sites sit in the contexts TGCT, TGCG and CGCA, and only CGCA has no T on either side.
In other words, the matching pattern is: {T}GC{T}, where {T} stands for any nucleotide other than T (roughly [^T]GC[^T] in regex notation).
But I can't find any way to implement it using the Biostrings package.

I really hope you can help me figure it out.
Thanks for your help.

biostrings bioconductor R • 282 views

What is the problem with just converting the DNAString to a character and doing your regex with that?
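A minimal sketch of what this suggestion could look like, in base R only. It assumes the sequence is already a plain character string (with Biostrings loaded, `as.character(x)` would produce it from the DNAString in the question), and it assumes a GC at the very start or end of the sequence should still count, since the lookarounds only reject an adjacent T. The names `seq_chr`, `starts` and `ends` are illustrative.

```r
# Plain character copy of the sequence; with Biostrings loaded this would be
# as.character(x) for the DNAString x from the question.
seq_chr <- "TGCTTGCGCA"

# Negative lookbehind/lookahead: GC not preceded by T and not followed by T.
# Lookarounds consume no characters, so the flanking letters stay available
# for neighbouring matches.
hits <- gregexpr("(?<!T)GC(?!T)", seq_chr, perl = TRUE)[[1]]

starts <- as.integer(hits)      # drop the match.length attributes
starts <- starts[starts > 0]    # gregexpr returns -1 when nothing matches
ends <- starts + 1L             # "GC" is two letters long

print(data.frame(start = starts, end = ends))
```

For the example sequence this reports the single GC at positions 8 to 9, the one inside CGCA.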

written 12 months ago by benformatics (1.7k)

Because I use a StringSet and I want the analysis to be as fast as possible. If I convert every single interval to character, I expect it will be much slower.
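One point that may be relevant to the speed concern: the conversion does not have to happen interval by interval, because `gregexpr` accepts a whole character vector in one call. The sketch below uses a hypothetical named character vector `seqs` as a stand-in for what `as.character()` returns for a set of sequences; the second and third sequences are made up for illustration. Whether this is fast enough for a given data set would have to be measured.

```r
# Stand-in for as.character(<set of sequences>): a named character vector.
# seq2 and seq3 are invented examples, not from the original post.
seqs <- c(seq1 = "TGCTTGCGCA", seq2 = "AGCAGCT", seq3 = "TTTT")

# One call over the whole vector; the result is a list with one element
# per input sequence.
hits <- gregexpr("(?<!T)GC(?!T)", seqs, perl = TRUE)

# Keep only real match positions (gregexpr reports -1 for "no match").
starts <- lapply(hits, function(h) {
  s <- as.integer(h)
  s[s > 0]
})
names(starts) <- names(seqs)

print(starts)
```

Each list element holds the start positions of the qualifying GC sites in the corresponding sequence, and an empty integer vector where there are none.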

modified 11 months ago • written 12 months ago by elisheva (100)

written 12 months ago by ATpoint (36k)