Question: Accessing start positions in a strand-specific manner from a GRanges object
5.7 years ago, R. Taylor Raborn (Tempe, AZ | Biodesign Institute at ASU) wrote:

Hi all:

I have a little bit of experience working with GRanges objects in R (from the GenomicRanges package in Bioconductor), but I keep running into a subsetting case that should be more straightforward than the solution I'm using. 

Let's say I have the following GRanges object (using the example from the reference):

gr2 <- GRanges(seqnames = c("chr1", "chr1"),
               ranges = IRanges(c(7, 13), width = 3),
               strand = c("+", "-")) # sample GRanges object

...which looks like this:

> gr2
GRanges object with 2 ranges and 0 metadata columns:
      seqnames    ranges strand
         <Rle> <IRanges>  <Rle>
  [1]     chr1  [ 7,  9]      +
  [2]     chr1  [13, 15]      -
  -------
  seqinfo: 1 sequence from an unspecified genome; no seqlengths

I'd like to access all start positions in this object in a strand-specific manner, where I define the "start" to be the first value in the IRanges interval if it's on the plus strand, and the second value if it's on the negative strand. Of course, the accessor methods start() and end() are both agnostic to strand, returning the first or second value of every interval, respectively.

For example:

> start(gr2)
[1]  7 13

> end(gr2)
[1]  9 15

My current work-around (which is ugly) looks something like the following:

> plus.i <- which(strand(gr2) == "+")   # which intervals are on the positive strand?
> start(gr2[plus.i])                    # strand-specific 'start' of those intervals
[1] 7
> minus.i <- which(strand(gr2) == "-")  # which intervals are on the negative strand?
> end(gr2[minus.i])                     # strand-specific 'start' of those intervals
[1] 15

I then concatenate both sets of vectors using c(). 

There must be an easier, more GRanges-centric approach to access these strand-specific 'starts'. Can anyone point me in the right direction? The real-world case I'm dealing with is alignments of mapped, strand-specific CAGE tags, where the 5' end of each interval represents the TSS.

Thanks in advance,


Tags: genomicranges, bioconductor, R
4.2 years ago, pariksheet.nanda wrote:

Easier would be to use start(resize(gr2, 1)).

resize(gr2, width = 1) is strand-aware: by default (fix = "start") it anchors each range at its strand-specific start, so setting the width to 1 makes both start() and end() equal to that position.
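
As a quick check of the suggestion above, here is a minimal sketch run against the gr2 object from the question. It assumes the GenomicRanges package is installed; the variable names five.prime, is.minus and five.prime.manual are only illustrative, and the ifelse() form is added here for comparison rather than being part of the original answer.

```r
# Assumes the Bioconductor package GenomicRanges is installed.
library(GenomicRanges)

# Same example object as in the question.
gr2 <- GRanges(seqnames = c("chr1", "chr1"),
               ranges = IRanges(c(7, 13), width = 3),
               strand = c("+", "-"))

# resize() anchors at the strand-aware start by default (fix = "start"),
# so a width-1 range sits on the 5' end of each interval.
five.prime <- start(resize(gr2, width = 1))
five.prime
# [1]  7 15

# Explicit form for comparison: take end() on the minus strand and
# start() otherwise. as.vector() turns the logical Rle into a plain vector.
is.minus <- as.vector(strand(gr2) == "-")
five.prime.manual <- ifelse(is.minus, end(gr2), start(gr2))
stopifnot(all(five.prime == five.prime.manual))
```

Unlike concatenating the plus and minus subsets with c(), both forms return the positions in the original order of gr2. Ranges with strand "*" are treated like "+" by resize(), and resize(gr2, width = 1, fix = "end") should give the strand-aware 3' ends in the same way.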


Ha, I was about to write custom code, this is way smarter, thanks!

written 13 hours ago by ATpoint