Biostrings: Subsetting vmatchPattern Results On An XStringSet Object
10.3 years ago
Timtico ▴ 330

I would like to extract a substring for every pattern match in an XStringSet object. The code below works, but it ignores sequences with multiple matches, and I suspect there is a cleaner way to do this with Biostrings functions.

I load a FASTA file into a DNAStringSet (an XStringSet subclass) and then search for a specific string using the vmatchPattern function:

genes <- readDNAStringSet(filepath = "filename", format = "fasta", use.names = TRUE)
view <- vmatchPattern(pattern = "CCGGA", subject = genes)
matches <- unlist(view, recursive = TRUE, use.names = TRUE)
m <- as.matrix(matches)

I then retrieve a 20-nucleotide substring that begins at the start of the match:

subseq(genes[rownames(m),], start = m[rownames(m),1], width = 20)

What is a better way to do this that includes all possible matches and uses Biostrings functions?
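
For illustration, here is a minimal sketch of one way the goal above might be reached. It is not a confirmed answer. It assumes that startIndex() on the vmatchPattern result gives one vector of start positions per sequence, and it uses made-up toy sequences in place of the FASTA file. The names starts_by_seq, seq_idx, keep and flanks are hypothetical. Matches that lie too close to the end of a sequence for a 20-nucleotide window are dropped, because subseq() raises an error for out-of-bounds ranges.

```r
library(Biostrings)

# Toy sequences standing in for the FASTA file (hypothetical data)
genes <- DNAStringSet(c(
  geneA = paste0("AA", "CCGGA", strrep("T", 20), "CCGGA", strrep("A", 22)),
  geneB = paste0("CCGGA", strrep("G", 5)),   # match too close to the end
  geneC = strrep("T", 30)                    # no match at all
))

hits <- vmatchPattern(pattern = "CCGGA", subject = genes)

# One vector of match start positions per sequence (NULL when no match)
starts_by_seq <- startIndex(hits)
n_hits <- lengths(starts_by_seq)

# Flatten the starts and repeat each sequence index once per match,
# so sequences with several matches appear several times
starts  <- unlist(starts_by_seq, use.names = FALSE)
seq_idx <- rep(seq_along(genes), n_hits)

# Keep only matches where a 20-nt window fits inside the sequence
win  <- 20L
keep <- starts + win - 1L <= width(genes)[seq_idx]

flanks <- subseq(genes[seq_idx[keep]], start = starts[keep], width = win)
flanks
```

Because the sequence index is repeated once per match, both matches in geneA are kept rather than only the first. The names of the result repeat accordingly, so they may need to be made unique (for example with make.unique()) before further indexing by name.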

r bioconductor • 2.3k views