Question: GRanges intersect operation
0
gravatar for bsmith030465
2.8 years ago by
bsmith030465140
United States
bsmith030465140 wrote:

I am trying to find the intersection of two GRanges objects. The first object has the coordinates for genes (grgenes below), the second mimics my data (grtest below). However, when I try to do the intersection, I appear to get an empty set. What am I doing wrong? (Sorry for the bad formatting! Can I turn the 'auto-format' off??)

grgenes

GRanges object with 23056 ranges and 1 metadata column:

    seqnames                 ranges strand   |       GENEID
       <Rle>              <IRanges>  <Rle>   | <FactorList>
  1    chr19 [ 58858172,  58874214]      -   |            1
 10     chr8 [ 18248755,  18258723]      +   |           10
100    chr20 [ 43248163,  43280376]      -   |          100

seqinfo: 93 sequences (1 circular) from hg19 genome

grtest <- GRanges(seqnames = "chr19",ranges=IRanges(start=c(58857500,58857505),width=1))

grtest

GRanges object with 2 ranges and 0 metadata columns:

  seqnames               ranges strand
     <Rle>            <IRanges>  <Rle>

[1] chr19 [58857500, 58857500] *

[2] chr19 [58857505, 58857505] *


seqinfo: 1 sequence from an unspecified genome; no seqlengths

intersect(grgenes,grtest)

GRanges object with 0 ranges and 0 metadata columns:

seqnames ranges strand <rle> <iranges> <rle>


seqinfo: 93 sequences (1 circular) from hg19 genome

granges • 4.0k views
ADD COMMENTlink modified 2.8 years ago • written 2.8 years ago by bsmith030465140
1
gravatar for russhh
2.8 years ago by
russhh4.3k
UK, U. Glasgow
russhh4.3k wrote:

Since grgene contains strand information, you need to either add ignore.strand = TRUE , or set the strandedness of your test data. You also need to ensure that your test data actually overlaps with your intiial data ([58857500, ..] doesn't overlap [58858172, 58874214]). I've added a couple of example GRanges that should definitely overlap your grgene ranges in the following:

grgene <- GRanges(
    seqnames = c('chr19', 'chr8', 'chr20'),
    ranges = IRanges(start = c(58858172, 18248755, 43248163),
                       end = c(58874214, 18258723, 43280376)),
    strand = c('-', '+', '-')
    ) # ignoring the GENEID column

grtest.unstrand <- GRanges(
    seqnames = "chr19",
    ranges=IRanges(
        start=c(58857500,58857505,58858500,58859505), # note the difference
        width=1))

grtest.strand <- GRanges(
    seqnames = "chr19",
    ranges=IRanges(
        start=c(58857500,58857505,58858500,58859505),
        width=1),
    strand = c('-', '+', '-', '+'))

> GenomicRanges::intersect(grgene, grtest.unstrand)
GRanges object with 0 ranges and 0 metadata columns:
   seqnames    ranges strand
      <Rle> <IRanges>  <Rle>
  -------
  seqinfo: 3 sequences from an unspecified genome; no seqlengths

> GenomicRanges::intersect(grgene, grtest.unstrand, ignore.strand = TRUE)
GRanges object with 2 ranges and 0 metadata columns:
      seqnames               ranges strand
         <Rle>            <IRanges>  <Rle>
  [1]    chr19 [58858500, 58858500]      *
  [2]    chr19 [58859505, 58859505]      *
  -------
  seqinfo: 3 sequences from an unspecified genome; no seqlengths

> GenomicRanges::intersect(grgene, grtest.strand)
GRanges object with 1 range and 0 metadata columns:
      seqnames               ranges strand
         <Rle>            <IRanges>  <Rle>
  [1]    chr19 [58858500, 58858500]      -
  -------
  seqinfo: 3 sequences from an unspecified genome; no seqlengths

> GenomicRanges::intersect(grgene, grtest.strand, ignore.strand = TRUE)
GRanges object with 2 ranges and 0 metadata columns:
      seqnames               ranges strand
         <Rle>            <IRanges>  <Rle>
  [1]    chr19 [58858500, 58858500]      *
  [2]    chr19 [58859505, 58859505]      *
  -------
  seqinfo: 3 sequences from an unspecified genome; no seqlengths

HTH

ADD COMMENTlink modified 2.8 years ago • written 2.8 years ago by russhh4.3k
0
gravatar for bsmith030465
2.8 years ago by
bsmith030465140
United States
bsmith030465140 wrote:

Thanks!

Also, how can I get the meta-data values to show in the intersect results. For example, with my original grgenes object:

intersect(grgenes,grtest.unstrand,ignore.strand=TRUE)

GRanges object with 2 ranges and 0 metadata columns:

  seqnames               ranges strand
     <Rle>            <IRanges>  <Rle>

[1] chr19 [58858500, 58858500] *

[2] chr19 [58859505, 58859505] *


seqinfo: 93 sequences (1 circular) from hg19 genome

How can I get the gene ids to show in the meta-data column?

ADD COMMENTlink written 2.8 years ago by bsmith030465140

Unfortunately, that won't be possible in GenomicRanges::intersect, since the metadata for a supersequence doesn't necessarily correspond to a subsequence. You can however, use findOverlaps to identify rows within your grgenes object that are overlapped by at least one entry of grtest, and then extract those rows. Look at the documentation

ADD REPLYlink written 2.8 years ago by russhh4.3k
Please log in to add an answer.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.3.0
Traffic: 1018 users visited in the last hour