hexamer match recursive subsetting
9 months ago
alfonso • 0

I'm looking for specific hexamers in a set of target sequences (a DNAStringSet) using Biostrings.

I can find one hexamer, then subset the original DNAStringSet to keep only those sequences that DO NOT have a match:

hexamer1 <- DNAString("ATTAAA")
ATTAAA <- unlist(vmatchPattern(hexamer1, target_seq))
target_seq_new <- target_seq[!names(target_seq) %in% names(ATTAAA)]
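The same filtering step can also be written with a per-sequence match count instead of comparing names, which should still work if the names are missing or duplicated. This is only a minimal sketch on made-up toy data: `toy_seq`, its sequence names, and `n_hits` are assumptions for illustration and not part of my real data.

```r
library(Biostrings)

# Toy sequences (hypothetical); the names only mimic the IDs in the real data
toy_seq <- DNAStringSet(c(
  "geneA:TT01" = "CCCATTAAACCC",
  "geneB:TT02" = "GGGGGGGGGGGG",
  "geneC:TT03" = "ATTAAAATTAAA"
))

hexamer1 <- DNAString("ATTAAA")

# Number of matches of the hexamer in each sequence (0 means no match)
n_hits <- vcountPattern(hexamer1, toy_seq)

# Keep only the sequences that do NOT contain the hexamer
toy_seq_new <- toy_seq[n_hits == 0]
names(toy_seq_new)
```

Here `vcountPattern()` returns one integer per sequence, so the subsetting uses a logical vector aligned with the DNAStringSet rather than a lookup by name.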

Then I want to start all over again with a new hexamer, this time searching only the sequences that are left:

hexamer2 <- DNAString("ATCTAA")
ATCTAA <- unlist(vmatchPattern(hexamer2, target_seq_new))
target_seq_new <- target_seq_new[!names(target_seq_new) %in% names(ATCTAA)]

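How can I make this a single function that takes a list of hexamers and goes through all of them in a step-wise manner, so that each hexamer is searched only in the sequences that did not match any earlier hexamer?


As output I would like a list of IRanges objects, one per hexamer, like this: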

IRanges object with 5966 ranges and 0 metadata columns:
                       start       end     width
                   <integer> <integer> <integer>
  FBgn0037332:TT05        29        34         6
  FBgn0011300:TT02        25        30         6
  FBgn0011300:TT02        39        44         6

IRanges object with 1375 ranges and 0 metadata columns:
                       start       end     width
                   <integer> <integer> <integer>
  FBgn0051619:TT03        42        47         6
  FBgn0010352:TT04        17        22         6
  FBgn0261822:TT05        10        15         6

IRanges object with 1267 ranges and 0 metadata columns:
                       start       end     width
                   <integer> <integer> <integer>
  FBgn0013272:TT02        42        47         6
  FBgn0013272:TT02        42        47         6
  FBgn0085391:TT04        11        16         6

Is there a function that I can use to do this recursive match?
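I am not aware of a single built-in function for this, but the steps above could be wrapped in a loop. The following is only a rough sketch under some assumptions: `match_hexamers_stepwise` is a hypothetical helper name, the hexamers are given as a character vector, and the toy sequences are made up for illustration. It relies on `vmatchPattern()`, `unlist()` and `elementNROWS()` behaving as in the code above.

```r
library(Biostrings)

# Hypothetical helper: for each hexamer in turn, record its matches in the
# sequences still remaining, then drop every sequence that matched.
match_hexamers_stepwise <- function(hexamers, target_seq) {
  hits <- vector("list", length(hexamers))
  names(hits) <- as.character(hexamers)
  remaining <- target_seq
  for (i in seq_along(hexamers)) {
    if (length(remaining) == 0L) {
      # Nothing left to search: record an empty result for this hexamer
      hits[[i]] <- IRanges()
      next
    }
    m <- vmatchPattern(DNAString(hexamers[i]), remaining)
    hits[[i]] <- unlist(m)                        # IRanges named by sequence
    remaining <- remaining[elementNROWS(m) == 0]  # keep non-matching sequences
  }
  list(hits = hits, remaining = remaining)
}

# Toy data (made up) to try the helper
toy_seq <- DNAStringSet(c(
  geneA = "CCCATTAAACCC",
  geneB = "GGGATCTAAGGG",
  geneC = "ATTAAAATCTAA",
  geneD = "GGGGGGGGGGGG"
))
res <- match_hexamers_stepwise(c("ATTAAA", "ATCTAA"), toy_seq)
res$hits       # one IRanges object per hexamer
res$remaining  # sequences that matched none of the hexamers
```

The order of the hexamers matters here: a sequence is assigned to the first hexamer it matches and is never searched again, so in the toy data `geneC` is reported only under `ATTAAA` even though it also contains `ATCTAA`.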

R Biostrings hexamers sequence motif • 218 views
