Question: Avoid intersection of adjacent regions in GenomicRanges::intersect()
1
gravatar for david.mas
25 days ago by
david.mas20
david.mas20 wrote:

Hi,

I may have missed something but,

I have 2 ranges objects like this:

suppressPackageStartupMessages(library(GenomicRanges))
#> Warning: package 'S4Vectors' was built under R version 3.5.1

rg1 = GenomicRanges::GRanges(seqnames = c(1,1,2),
                             IRanges(start = c(1,10,1),
                                     end = c(9,19,20))
                             )
rg1
#> GRanges object with 3 ranges and 0 metadata columns:
#>       seqnames    ranges strand
#>          <Rle> <IRanges>  <Rle>
#>   [1]        1       1-9      *
#>   [2]        1     10-19      *
#>   [3]        2      1-20      *
#>   -------
#>   seqinfo: 2 sequences from an unspecified genome; no seqlengths

rg2 = GenomicRanges::GRanges(seqnames = c(1),
                             IRanges(start = c(5),
                                     end = c(15))
)

rg2
#> GRanges object with 1 range and 0 metadata columns:
#>       seqnames    ranges strand
#>          <Rle> <IRanges>  <Rle>
#>   [1]        1      5-15      *
#>   -------
#>   seqinfo: 1 sequence from an unspecified genome; no seqlengths

Then I want to intesect them by maintaining the adjacent intances in the original rg1 into different instances in the result object

The default behaviour merge them in the same instance:

rg_intersect = GenomicRanges::intersect(rg1,rg2)

rg_intersect
#> GRanges object with 1 range and 0 metadata columns:
#>       seqnames    ranges strand
#>          <Rle> <IRanges>  <Rle>
#>   [1]        1      5-15      *
#>   -------
#>   seqinfo: 2 sequences from an unspecified genome; no seqlengths

I tried to merge after applying a findOverlaps method, because the result of the FO method only gives 2 hits so I would expect that it also gives 2 instances. However, it also merges the 2 things.

findOverlapPairs(rg1,rg2)
#> Pairs object with 2 pairs and 0 metadata columns:
#>           first    second
#>       <GRanges> <GRanges>
#>   [1]     1:1-9    1:5-15
#>   [2]   1:10-19    1:5-15

GenomicRanges::intersect(findOverlapPairs(rg1,rg2))
#> GRanges object with 1 range and 0 metadata columns:
#>       seqnames    ranges strand
#>          <Rle> <IRanges>  <Rle>
#>   [1]        1      5-15      *
#>   -------
#>   seqinfo: 2 sequences from an unspecified genome; no seqlengths

Created on 2019-03-25 by the reprex package (v0.2.0).

R • 84 views
ADD COMMENTlink modified 25 days ago • written 25 days ago by david.mas20
2
gravatar for david.mas
25 days ago by
david.mas20
david.mas20 wrote:

Just found that I should be able to do it with the pintersect function from a Pair object. In my example it works but I am not familiar with this method so I am not sure if it would be general enough.

suppressPackageStartupMessages(library(GenomicRanges))
#> Warning: package 'S4Vectors' was built under R version 3.5.1

rg1 = GenomicRanges::GRanges(seqnames = c(1,1,2),
                             IRanges(start = c(1,10,1),
                                     end = c(9,19,20))
)

rg2 = GenomicRanges::GRanges(seqnames = c(1),
                             IRanges(start = c(5),
                                     end = c(15))
)

GenomicRanges::pintersect(findOverlapPairs(rg1,rg2))
#> GRanges object with 2 ranges and 1 metadata column:
#>       seqnames    ranges strand |       hit
#>          <Rle> <IRanges>  <Rle> | <logical>
#>   [1]        1       5-9      * |      TRUE
#>   [2]        1     10-15      * |      TRUE
#>   -------
#>   seqinfo: 2 sequences from an unspecified genome; no seqlengths

Created on 2019-03-25 by the reprex package (v0.2.0).

ADD COMMENTlink written 25 days ago by david.mas20
Please log in to add an answer.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.3.0
Traffic: 1524 users visited in the last hour