Question: Convert regex into DNAString object
6 weeks ago, elisheva (80) wrote:

I am trying to search a sequence for a pattern such that a specific nucleotide does not flank the match.
For example, given the following sequence:

x <- DNAString("TGCTTGCGCA")

I want to extract all occurrences of GC that have no T immediately before or after them.
Only one occurrence fits: of the three candidates TGCT, TGCG and CGCA, only CGCA meets the condition.
In other words, the matching pattern is {T}GC{T}, where {T} means any nucleotide except T.
But I can't find a way to implement this using the Biostrings package.

I really hope you can help me figure it out.
Thanks for your help.

biostrings bioconductor R • 106 views

What is the problem with just converting the DNAString to a character string and running your regex on that?

Reply written 6 weeks ago by benformatics (1.1k)
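
The suggestion above (convert to character, then use a regex) could look roughly like the sketch below. It is not from the thread. It assumes that a flanking base must exist on both sides of GC, so a GC at the very start or end of the sequence is not reported, and it uses lookbehind/lookahead so that only the GC itself is matched and neighbouring hits can share a flanking base.

```r
library(Biostrings)

x <- DNAString("TGCTTGCGCA")

# Convert to a plain character string so base R regex functions can be used.
s <- as.character(x)

# Lookbehind and lookahead require a non-T base on each side of GC
# without consuming it, so only the GC itself is reported.
# Assumption: a base must be present on both sides.
m <- gregexpr("(?<=[^T])GC(?=[^T])", s, perl = TRUE)[[1]]

starts <- as.integer(m)                          # -1 if there is no match
ends   <- starts + attr(m, "match.length") - 1L  # only meaningful when a match exists
```

For the example sequence this reports the single GC at positions 8 to 9. `as.character()` also accepts a DNAStringSet, and `gregexpr()` is vectorized over character vectors, so the conversion does not have to be done one sequence at a time.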

Because I use a StringSet and I want the analysis to be as fast as possible. If I convert every single interval to character, I guess it will be much slower.

Reply modified 4 weeks ago • written 6 weeks ago by elisheva (80)
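
One way to stay inside Biostrings, sketched here as a possibility rather than as an answer given in the thread, is to express "any base except T" with the IUPAC ambiguity code V (A, C or G) and let `matchPattern()` interpret it via `fixed = FALSE`. The pattern "VGCV" and the example set `seqs` are illustrative assumptions. Like the {T}GC{T} pattern in the question, this requires a base on both sides of GC.

```r
library(Biostrings)

x <- DNAString("TGCTTGCGCA")

# V is the IUPAC code for A, C or G, i.e. any base except T.
# fixed = FALSE makes matchPattern interpret ambiguity codes.
hits <- matchPattern("VGCV", x, fixed = FALSE)

# Each hit includes the two flanking bases; trim one base per side
# to get the coordinates of the GC itself.
inner_start <- start(hits) + 1L
inner_end   <- end(hits) - 1L

# Same idea across a whole set, without converting to character.
# `seqs` is a made-up example set.
seqs <- DNAStringSet(c(a = "TGCTTGCGCA", b = "AGCAGCA"))
set_hits <- vmatchPattern("VGCV", seqs, fixed = FALSE)
```

For the example sequence the only hit is CGCA at positions 7 to 10, so the GC itself sits at positions 8 to 9. Whether this is actually faster than the character-based regex on a large set would need to be benchmarked.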

Reply written 6 weeks ago by ATpoint (23k)