Question: Partial or complete overlap of two genomic ranges
6.0 years ago by
Jimbou710
Germany
Jimbou710 wrote:

Hello,

I need your help. I want to compare two data frames of genomic features, frame 2 (right) against frame 1 (left), like this:

chr  start  end  name    chr  start  end  name
1    2      5    A       1    1      3    AA
2    10     15   B       2    9      16   BB
3    27     30   C       3    28     29   CC

As a result I want this:

1. Completely overlapping:

chr  start  end  name    chr  start  end  name
2    10     15   B       2    9      16   BB

2. Partial overlap:

chr  start  end  name    chr  start  end  name
1    2      5    A       1    1      3    AA

3. and within the range:

chr  start  end  name    chr  start  end  name
3    27     30   C       3    28     29   CC

 

Are the R packages IRanges and GenomicRanges suitable for this kind of analysis? Or do I have to write the > and < comparisons myself?

grange sequence R • 7.4k views
5.3 years ago by
komal.rathi3.5k
Children's Hospital of Philadelphia, Philadelphia, PA
komal.rathi3.5k wrote:

Using the GenomicRanges library in R:

library(GenomicRanges)
x1 = read.table(text="chr start end name  
1    2    5   A
2   10    15  B
3   27    30  C",header=T)

x2 = read.table(text="chr start end name
1    1    3   AA
2    9    16  BB
3    28   29  CC",header=T)

# Make GRanges object
gr1 = with(x1, GRanges(chr, IRanges(start = start, end = end, names = name)))
gr2 = with(x2, GRanges(chr, IRanges(start = start, end = end, names = name)))

# Completely overlapping: the frame-1 range lies entirely inside the frame-2 range
type1 = findOverlaps(query = gr1, subject = gr2, type = "within")
type1.df = data.frame(x1[queryHits(type1),], x2[subjectHits(type1),])
type1.df
#   chr start end name chr.1 start.1 end.1 name.1
# 2   2    10  15    B     2       9    16     BB

# Within range: the frame-2 range lies entirely inside the frame-1 range
type3 = findOverlaps(query = gr2, subject = gr1, type = "within")
type3.df = data.frame(x1[subjectHits(type3),], x2[queryHits(type3),])
type3.df
#   chr start end name chr.1 start.1 end.1 name.1
# 3   3    27  30    C     3      28    29     CC

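# Partial overlaps only (no complete overlaps or within-range overlaps)
# type = 'any' returns every overlapping pair, including the type 1 and type 3 pairs
type2 = findOverlaps(query = gr1, subject = gr2, type = 'any')
type2.df = data.frame(x1[queryHits(type2),], x2[subjectHits(type2),])
# Type 1 and type 3 pairs now appear twice in x, so dropping every duplicated row
# leaves only the partial overlaps
x = rbind(type1.df, type2.df, type3.df)
type2.df = x[!(duplicated(x) | duplicated(x, fromLast = TRUE)), ]
type2.df
#    chr start end name chr.1 start.1 end.1 name.1
# 21   1     2   5    A     1       1     3     AA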

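This gives three separate data frames, one per category in your question. Based on your example, I am assuming type 1 and type 3 are opposites: in type 1 the frame-1 range lies inside the frame-2 range, and in type 3 it is the other way around.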
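
On the original question of whether the > and < comparisons have to be written by hand: for small tables they can be, and a rough base-R sketch may make the three categories easier to see. This is not part of the GenomicRanges approach above. It assumes closed intervals (both start and end are included), and the `pairs` and `class` names are made up for illustration.

```r
# Frame 1 and frame 2 from the question
x1 <- data.frame(chr = c(1, 2, 3), start = c(2, 10, 27), end = c(5, 15, 30),
                 name = c("A", "B", "C"), stringsAsFactors = FALSE)
x2 <- data.frame(chr = c(1, 2, 3), start = c(1, 9, 28), end = c(3, 16, 29),
                 name = c("AA", "BB", "CC"), stringsAsFactors = FALSE)

# Pair every frame-1 range with every frame-2 range on the same chromosome
pairs <- merge(x1, x2, by = "chr", suffixes = c(".1", ".2"))

# Keep only pairs that share at least one position (closed intervals assumed)
pairs <- pairs[pairs$start.1 <= pairs$end.2 & pairs$start.2 <= pairs$end.1, ]

# in2: frame-1 range lies inside the frame-2 range (type 1 in the question)
# in1: frame-2 range lies inside the frame-1 range (type 3 in the question)
in2 <- pairs$start.1 >= pairs$start.2 & pairs$end.1 <= pairs$end.2
in1 <- pairs$start.2 >= pairs$start.1 & pairs$end.2 <= pairs$end.1

# Identical ranges satisfy both tests and are labelled "complete" here
pairs$class <- ifelse(in2, "complete", ifelse(in1, "within", "partial"))

# One data frame per category
split(pairs, pairs$class)
```

On the example data this labels B/BB as complete, A/AA as partial and C/CC as within. The merge builds every same-chromosome pair, so it only suits small inputs; for real genomic data findOverlaps is the better choice.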


Extremely useful add-on:

http://faculty.ucr.edu/~tgirke/Documents/R_BioCond/My_R_Scripts/rangeoverlapper.R

(reply written 5.1 years ago by Jimbou)

How do I find out intergenic locations?

(reply written 19 months ago by Paul80)
6.0 years ago by
Alex Reynolds29k
Seattle, WA USA
Alex Reynolds29k wrote:

Take a look at BEDOPS bedmap --fraction-map, which allows you to recover map elements (those on your right-hand side) which overlap reference elements (those on your left-hand side) by some fractional value between 0 and 1, inclusive (i.e. between 0 and 100%).

For instance, if your reference and map data sets are sorted BED files called A and B, respectively, then you could do:

$ bedmap --echo --echo-map --fraction-map 1 A B

to get all elements from set B that are completely overlapped by an element in set A (with --fraction-map, the fraction is measured against the length of the map element).

These tools can run from the command line or within R, via system() calls.
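
As a rough illustration of the R route, the same command could be wrapped as below. This is a sketch only: `bedmap_args` is a hypothetical helper, the file names A and B are the placeholders from the command above, and the call is skipped unless bedmap is on the PATH and both files exist. It uses system2() rather than system() so the arguments can be passed as a vector.

```r
# Build the argument vector for the bedmap call shown above.
# `ref` and `map` are paths to sorted BED files (placeholder names here).
bedmap_args <- function(ref, map, fraction = 1) {
  c("--echo", "--echo-map", "--fraction-map", as.character(fraction), ref, map)
}

args <- bedmap_args("A", "B")

# Run only if bedmap is installed and both input files are present
if (nzchar(Sys.which("bedmap")) && all(file.exists(c("A", "B")))) {
  hits <- system2("bedmap", args, stdout = TRUE)  # one output line per reference element
  print(hits)
}
```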

5.3 years ago by
EagleEye6.6k
Sweden
EagleEye6.6k wrote:
You could also try GenGen: http://www.openbioinformatics.org/gengen/tutorial_scan_region.html