Question: Convert regex into DNAString object
6 weeks ago, elisheva (80), Israel, wrote:

Hi,
I am trying to search for a pattern in a sequence such that a specific nucleotide is not directly adjacent to the match on either side.
For example, given the following sequence:

x <- DNAString("TGCTTGCGCA")

I want to extract all occurrences of GC that have no T immediately before or after them.
Only one occurrence fits: in context, the three GC hits are TGCT, TGCG and finally CGCA, and only the last meets the condition.
In other words, the matching pattern is {T}GC{T}, where {T} stands for "any base except T".
But I can't find a way to implement this using the Biostrings package.

I really hope you can help me figure it out.
Thanks for your help.

biostrings bioconductor R • 106 views
written 6 weeks ago by elisheva (80)

What is the problem with just converting the DNAString to a character and doing your regex with that?

written 6 weeks ago by benformatics (1.1k)
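
For reference, the character-based route suggested above could look roughly like the sketch below. It uses only base R; the variable names are illustrative, and the sequence is the one from the question written as a plain string (for a DNAString you would call as.character() on it first). Note that negative lookarounds also accept a GC at the very start or end of the sequence, which may or may not be what is wanted.

```r
# Sequence from the question, held as a plain character string.
s <- "TGCTTGCGCA"

# Negative lookbehind / lookahead: GC not preceded by T and not followed by T.
# perl = TRUE is needed for lookaround syntax in base R regex functions.
hits <- gregexpr("(?<!T)GC(?!T)", s, perl = TRUE)[[1]]

# gregexpr() returns -1 when nothing matches, so drop non-positive values.
starts <- as.integer(hits)
starts <- starts[starts > 0]
ends <- starts + 1L   # GC is two bases long

print(data.frame(start = starts, end = ends))   # one hit: start 8, end 9
```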

Because I am using a StringSet and I want the analysis to be as fast as possible. If I convert every single interval to character, I expect it will be much slower.

modified 4 weeks ago • written 6 weeks ago by elisheva (80)
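
One way to stay inside Biostrings, sketched here under a few assumptions, is to express "not T" with the IUPAC ambiguity code V (A, C or G) and match with fixed = FALSE. The example set seqs and its second sequence are made up for illustration. Two caveats: the pattern VGCV needs a real base on both sides, so a GC at the very start or end of a sequence is not reported, and with fixed = FALSE any ambiguity codes in the subject are also treated as ambiguous.

```r
library(Biostrings)

# Hypothetical example set; the first sequence is the one from the question.
seqs <- DNAStringSet(c("TGCTTGCGCA", "AGCAGCT"))

# V = A, C or G, i.e. any base except T.
# fixed = FALSE makes IUPAC ambiguity codes match as sets of bases.
hits <- vmatchPattern("VGCV", seqs, fixed = FALSE)

# Each hit covers the two flanking bases as well, so shift the start
# by one to get the position of the GC itself within each sequence.
gc_starts <- lapply(startIndex(hits), function(s) s + 1L)
print(gc_starts)
```

For a single DNAString, matchPattern("VGCV", x, fixed = FALSE) is the analogous call.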

https://support.bioconductor.org/p/121676/

written 6 weeks ago by ATpoint (23k)