Question

How do I summarize a GRanges data frame into "one complete RLE"?

9.9 years ago

kamaitachi • 0

I have a GRanges object holding a mappability track that looks like this:

> m
GRanges with 31194271 ranges and 1 metadata column:
                              seqnames         ranges strand   | mappable
                                 <Rle>      <IRanges>  <Rle>   |    <Rle>
         [1]                         4   [5981, 5985]      *   |    FALSE
         [2]                         4   [5986, 5990]      *   |    FALSE
         [3]                         4   [5991, 5995]      *   |    FALSE
         [4]                         4   [5996, 6000]      *   |    FALSE
         [5]                         4   [6001, 6005]      *   |    FALSE
         ...                       ...            ...    ... ...      ...
  [31194267] dmel_mitochondrion_genome [19496, 19500]      *   |    FALSE
  [31194268] dmel_mitochondrion_genome [19501, 19505]      *   |    FALSE
  [31194269] dmel_mitochondrion_genome [19506, 19510]      *   |    FALSE
  [31194270] dmel_mitochondrion_genome [19511, 19515]      *   |    FALSE
  [31194271] dmel_mitochondrion_genome [19516, 19517]      *   |    FALSE

How can I summarize the ranges so that, for example, positions 5981-6005 on chromosome 4 are collapsed into a single range with mappable = FALSE?

RLE GRanges GenomicRanges R • 3.2k views

updated 2.6 years ago by Ram • written 9.9 years ago by kamaitachi

score 4 · Accepted Answer · 2014-05-22

9.9 years ago

Devon Ryan 104k

reduce(m)
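
To see what that call does before running it on 31 million ranges, here is a minimal sketch on made-up data. The `toy` object and its values are assumptions for illustration only, not the real track. It shows reduce() merging directly adjacent ranges and returning a result without the metadata column.

```r
library(GenomicRanges)

# Hypothetical stand-in for the mappability track: 5-bp tiles on chromosome "4"
toy <- GRanges(
  seqnames = "4",
  ranges   = IRanges(start = c(5981, 5986, 5991, 6101), width = 5),
  mappable = c(FALSE, FALSE, FALSE, TRUE)
)

# With the default min.gapwidth = 1, overlapping and directly adjacent
# ranges are merged; the tile starting at 6101 is not adjacent, so it stays separate
merged <- reduce(toy)

merged               # two ranges: 5981-5995 and 6101-6105
ncol(mcols(merged))  # 0, the mappable column is not carried over
```

Note that reduce() looks only at coordinates, so adjacent tiles are merged whether or not their mappable values agree.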

Edit: I guess you want the extra columns too. In that case it's a bit more complicated.

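m2 <- reduce(m)  # merged ranges; reduce() drops the metadata columns
IDX <- findOverlaps(m, m2)  # query = original ranges, subject = merged ranges
IDX2 <- IDX[!duplicated(subjectHits(IDX))]  # Just assign things once: first original range per merged range
# Copy that first range's value over, ordered by merged-range index
mcols(m2)$mappable <- mcols(m)$mappable[queryHits(IDX2)][order(subjectHits(IDX2))]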

or something quite close to that.


Of course. It's always a one-liner. Thanks very much! :)

written 9.9 years ago by kamaitachi

Note the update! I'd forgotten about the metadata columns, which you want to keep. There's no built-in way to get reduce() to keep those, so I just assign the value of the first original range that overlaps each merged range. One could think of more complicated ways to do that, likely by splitting the original values by subjectHits() and then applying a function to each group.
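
A rough sketch of that split-and-apply idea, on made-up toy data (the `toy` object, the `all_mappable` / `any_mappable` column names, and the choice of all() and any() as summary functions are all assumptions for illustration). Part (b) is a different technique, not the one from the answer: since reduce() merges adjacent tiles even when their values differ, one option is to split the ranges by value first and reduce each group separately. The toy column is a plain logical vector; an Rle column like the one in the question may need as.vector() first.

```r
library(GenomicRanges)

# Hypothetical toy data: six contiguous 5-bp tiles whose value changes partway through
toy <- GRanges(
  seqnames = "4",
  ranges   = IRanges(start = seq(5981, by = 5, length.out = 6), width = 5),
  mappable = c(FALSE, FALSE, FALSE, TRUE, TRUE, FALSE)
)

# (a) Split the original values by merged range, then apply a summary function
merged   <- reduce(toy)
hits     <- findOverlaps(toy, merged)
by_range <- split(toy$mappable[queryHits(hits)], subjectHits(hits))
merged$all_mappable <- vapply(by_range, all, logical(1), USE.NAMES = FALSE)
merged$any_mappable <- vapply(by_range, any, logical(1), USE.NAMES = FALSE)

# (b) Alternative: merge only runs of tiles that share the same value
red  <- reduce(split(toy, toy$mappable))   # one reduced set per value ("FALSE", "TRUE")
runs <- unlist(red, use.names = FALSE)
runs$mappable <- rep(as.logical(names(red)), lengths(red))
runs <- sort(runs)                         # back into genomic order
```

On this toy input, (a) gives a single merged range flagged as partly mappable, while (b) gives three ranges (FALSE, TRUE, FALSE), which may be closer to "one line of FALSEs" per run.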

written 9.9 years ago by Devon Ryan