Question: How to facilitate the performance of overlap for GRanges objects (verifiable example is given)?
0
gravatar for Jurat Shahidin
3.4 years ago by
Chicago, IL, USA
Jurat Shahidin80 wrote:

Hi every one:

I have GRanges objects to find overlap simultaneously, I did implement function to accomplish this task, it works good so far. Now I need to figure out better way to facilitate the process, and needs to come up more compatible function to improve the re-usability of my code. However, I tried my attempt to get better function for it, but I don't happy with my current function performance. Therefore, I came for this community to get some ideas to achieve my desired output.

Here is the reproducible example to understand the problem that I faced:

data:

gr1 <- GRanges(
      seqnames=Rle(c("chr1", "chr2", "chr3", "chr4"), c(5, 6, 4, 3)),
      ranges=IRanges(seq(1, by=9, len=18), seq(6, by=9, len=18)),
      rangeName=letters[seq(1:18)], score=sample(1:25, 18, replace = FALSE))

gr2<- GRanges(
  seqnames=Rle(c("chr1", "chr2", "chr3","chr4"), c(4, 7, 5, 4)),
  ranges=IRanges(seq(2, by=11, len=20), seq(8, by=11, len=20)),
  rangeName=letters[seq(1:20)], score=sample(1:25, 20, replace = FALSE))
gr3 <- GRanges(
  seqnames=Rle(c("chr1", "chr2", "chr3","chr4"), c(8, 7, 6, 4)),
  ranges=IRanges(seq(4, by=11, len=25), seq(9, by=11, len=25)),
  rangeName=letters[seq(1:25)], score=sample(1:25, 25, replace = FALSE))

step 1:

library(magrittr)
gr1.ss <- gr1 %>% subset(score > 12) 
gr1.ww <- gr1 %>% subset(score >5 & score <=12)
gr1.nn <- gr1 %>% subset(score <= 5)

 targets <- list(gr2, gr3)

step 2 :

ov.ss <- lapply(targets, function(ele_) {
  res <- as(findOverlaps(gr1.ss, ele_), "List")
})
ov.ww <- lapply(targets, function(ele_) {
  res <- as(findOverlaps(gr1.ww, ele_), "List")
})

this is my attempt to facilitate ov.ss, ov.ww can happen one wrapper function, so I can loop through gr1.res by finding match of its name, and internally call ov .

step 3 - my attempt (need to be improved) :

gr1.res <- list('ss'=gr1.ss, 'ww'=gr1.ww, 'noise'=gr1.nn)
for(i in 1:length((gr1.res))) {
  query <- S4Vectors::getListElement(gr1.res, i)
  if(names(query)=='nn') {
     message("do not process, and directly export as data.frame")
     out <- unique(query) %>% as.data.frame
  }
  ov <- lapply(targets, function(ele_) {
      res <- as(findOverlaps(query, ele_), "List")
    })
  ov
}

I don't want to process gr1.nn and just directly export them as data.frame objects. I want to integrate doing ov.ss and ov.ww in one wrapper functions, so I want to come up new way to do this. How can I facilitate my attempt solution? How can I have more compatible function for facilitating my desired attempt. I will be grateful if any one can give me any enlightening solution, possible suggestion or ideas to achieve my desired functions. Thanks a lot

Best regards:

Jurat

ADD COMMENTlink modified 3.4 years ago • written 3.4 years ago by Jurat Shahidin80
3
gravatar for Giovanni M Dall'Olio
3.4 years ago by
London, UK
Giovanni M Dall'Olio26k wrote:

What about the following:

> intersect(intersect(gr1, gr2), gr3)
GRanges object with 6 ranges and 0 metadata columns:
      seqnames    ranges strand
         <Rle> <IRanges>  <Rle>
  [1]     chr1  [ 4,  6]      *
  [2]     chr1  [15, 15]      *
  [3]     chr1  [19, 19]      *
  [4]     chr1  [28, 30]      *
  [5]     chr1  [37, 41]      *
  [6]     chr2  [92, 96]      *
  -------
  seqinfo: 4 sequences from an unspecified genome; no seqlengths

Just to make sure that the order of the intersections does not matter:

> intersect(intersect(gr3, gr2), gr1)
GRanges object with 6 ranges and 0 metadata columns:
      seqnames    ranges strand
         <Rle> <IRanges>  <Rle>
  [1]     chr1  [ 4,  6]      *
  [2]     chr1  [15, 15]      *
  [3]     chr1  [19, 19]      *
  [4]     chr1  [28, 30]      *
  [5]     chr1  [37, 41]      *
  [6]     chr2  [92, 96]      *
  -------
  seqinfo: 4 sequences from an unspecified genome; no seqlengths
ADD COMMENTlink written 3.4 years ago by Giovanni M Dall'Olio26k
Please log in to add an answer.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.3.0
Traffic: 1003 users visited in the last hour