Question

hexamer match recursive subsetting

3.6 years ago

alfonso • 0

I'm looking for specific hexamers in a set of target sequences (a DNAStringSet) using Biostrings.

I can find one hexamer, then subset the original DNAStringSet and keep only those sequences that do NOT have a match:

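library(Biostrings)
hexamer1 <- DNAString("ATTAAA")
# All matches of the hexamer, as an IRanges named by sequence
ATTAAA <- unlist(vmatchPattern(hexamer1, target_seq))
# Keep only the sequences without a match
target_seq_new <- target_seq[!names(target_seq) %in% names(ATTAAA)]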

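Then I want to start all over again with a new hexamer, searching only the sequences that are left:

hexamer2 <- DNAString("ATCTAA")
ATCTAA <- unlist(vmatchPattern(hexamer2, target_seq_new))
# Subset by the names of the already-filtered set, target_seq_new
target_seq_new <- target_seq_new[!names(target_seq_new) %in% names(ATCTAA)]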

How can I make this a single function that takes a list of hexamers and goes through all of them in a step-wise manner?

hexamers <- c("AAGAAA", "AATACA", "AATAGA", "AATATA", "AATGAA", "ACTAAA", "AGTAAA", "CATAAA", "GATAAA", "TATAAA", "TTTAAA")

As output, I would like a list with one set of matches per hexamer, like this one:

$AATAAA
IRanges object with 5966 ranges and 0 metadata columns:
                       start       end     width
                   <integer> <integer> <integer>
  FBgn0037332:TT05        29        34         6
  FBgn0011300:TT02        25        30         6
  FBgn0011300:TT02        39        44         6

$ATTAAA
IRanges object with 1375 ranges and 0 metadata columns:
                       start       end     width
                   <integer> <integer> <integer>
  FBgn0051619:TT03        42        47         6
  FBgn0010352:TT04        17        22         6
  FBgn0261822:TT05        10        15         6

$AATATA
IRanges object with 1267 ranges and 0 metadata columns:
                       start       end     width
                   <integer> <integer> <integer>
  FBgn0013272:TT02        42        47         6
  FBgn0013272:TT02        42        47         6
  FBgn0085391:TT04        11        16         6

Is there a function that I can use to do this recursive match?
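
To make the step-wise procedure concrete, here is a minimal sketch of what such a function might look like. I am not aware of a single Biostrings call that does this, so `match_hexamers_stepwise` is a hypothetical helper name, not a Biostrings function, and the three toy sequences are made up for illustration. It assumes the sequences in `target_seq` are named, and it drops matched sequences with a logical mask rather than by name, so duplicated names would not cause trouble.

```r
library(Biostrings)

# Hypothetical helper (not part of Biostrings): for each hexamer in turn,
# record its matches, then remove the matched sequences before the next one.
match_hexamers_stepwise <- function(hexamers, target_seq) {
  stopifnot(!is.null(names(target_seq)))
  remaining <- target_seq
  hits <- vector("list", length(hexamers))
  names(hits) <- hexamers
  for (i in seq_along(hexamers)) {
    # MIndex with one entry per remaining sequence
    m <- vmatchPattern(DNAString(hexamers[i]), remaining)
    # Flatten to an IRanges named by sequence, as in the manual steps
    hits[[i]] <- unlist(m)
    # Keep only the sequences with zero matches for this hexamer
    remaining <- remaining[elementNROWS(m) == 0]
  }
  list(hits = hits, remaining = remaining)
}

# Toy input, invented for illustration only
target_seq <- DNAStringSet(c(s1 = "GGATTAAAGG",
                             s2 = "CCATCTAACC",
                             s3 = "GGGGGGGGGG"))
res <- match_hexamers_stepwise(c("ATTAAA", "ATCTAA"), target_seq)
res$hits              # list of IRanges, one element per hexamer
names(res$remaining)  # "s3": the only sequence matching neither hexamer
```

Because each hexamer is searched only in the sequences that survived the previous steps, the order of `hexamers` changes the result.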

R Biostrings hexamers sequence motif • 714 views
