Question: How to obtain distinct/unique rows from a GenomicRanges object
gundalav wrote (12 months ago):

I have the following GenomicRanges object created with this:

library(GenomicRanges)
gr <- GRanges(seqnames = "chr1", strand = c("+", "-","-", "+"),ranges = IRanges(start = c(1,3,3,5), width = 3))
gr

That looks like this:

GRanges object with 4 ranges and 0 metadata columns:
      seqnames    ranges strand
         <Rle> <IRanges>  <Rle>
  [1]     chr1       1-3      +
  [2]     chr1       3-5      -
  [3]     chr1       3-5      -
  [4]     chr1       5-7      +

What I want is to obtain the unique rows from it, yielding this (hand-coded):

GRanges object with 3 ranges and 0 metadata columns:
      seqnames    ranges strand
         <Rle> <IRanges>  <Rle>
  [1]     chr1       1-3      +
  [2]     chr1       3-5      -
  [3]     chr1       5-7      +

How can I achieve that? In reality, I have around 9 million rows to process.

I can use this method, but it is very slow:

 library(tidyverse)
 gr %>%
   as_tibble() %>%
   distinct()

zx8754 wrote (12 months ago):

Use unique() directly on the GRanges object, as usual (no need for tidyverse):

unique(gr)
# GRanges object with 3 ranges and 0 metadata columns:
#       seqnames    ranges strand
#          <Rle> <IRanges>  <Rle>
#   [1]     chr1       1-3      +
#   [2]     chr1       3-5      -
#   [3]     chr1       5-7      +
#   -------
#   seqinfo: 1 sequence from an unspecified genome; no seqlengths

Then convert to data.frame if needed:

data.frame(unique(gr))
#     seqnames start end width strand
#   1     chr1     1   3     3      +
#   2     chr1     3   5     3      -
#   3     chr1     5   7     3      +
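
If you also want to see which rows were dropped, here is a minimal sketch using duplicated(), which GenomicRanges provides alongside unique(). The variable names dup and gr_unique are only illustrative. As far as I know, both functions compare seqnames, ranges and strand only, so metadata columns would not be taken into account.

```r
library(GenomicRanges)

# Same example object as in the question
gr <- GRanges(seqnames = "chr1",
              strand = c("+", "-", "-", "+"),
              ranges = IRanges(start = c(1, 3, 3, 5), width = 3))

# duplicated() flags each range that repeats an earlier one
dup <- duplicated(gr)
dup
# [1] FALSE FALSE  TRUE FALSE

# Keeping the non-flagged ranges should match unique(gr)
gr_unique <- gr[!dup]
length(gr_unique)
# [1] 3
```

Subsetting with the logical vector stays inside GRanges, so there is no conversion to a data frame or tibble, which is probably where most of the time goes with 9 million rows.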