Question: Row-specific findOverlaps of two GRanges datasets
5 months ago, RogelioG wrote:

Hi all,

I am working on a way to find sense/antisense encoded transcripts and the amount of overlap they have.

There is one final piece of code I can't get working: I have two data frames (converted to GRanges) with the EnsT IDs and the start and end positions of each transcript, one for the sense-encoded and one for the antisense-encoded transcripts. From these I want to calculate the amount of overlap between the two corresponding sense/antisense transcripts that sit in the same row.

Here is the issue: findOverlaps compares every range in one GRanges object against every range in the other, whereas I need it to check for overlap only between ranges with the same row number in both GRanges objects. It is probably easy to do, but I can't find a solution in the documentation.

Thanks beforehand!

RG

granges R overlap • 206 views

It would help to paste some sample data and to show the result that you want.

written 5 months ago by Kevin Blighe

Here is the data and part of the script I am running:

A <- makeGRangesFromDataFrame(a)
B <- makeGRangesFromDataFrame(b)

A             seqnames               ranges strand
                <Rle>            <IRanges>  <Rle>
12 ENSMUST00000138535 [61767414, 61851540]      *
21 ENSMUST00000016309 [74288246, 74304336]      *
25 ENSMUST00000027431 [86099036, 86111116]      *
26 ENSMUST00000027431 [86099036, 86111970]      *
28 ENSMUST00000113212 [87384576, 87394729]      *

B            seqnames               ranges strand
                <Rle>            <IRanges>  <Rle>
12 ENSMUST00000129030 [61638823, 61769859]      *
21 ENSMUST00000087226 [74285033, 74353692]      *
25 ENSMUST00000080204 [85600702, 85645036]      *
26 ENSMUST00000054279 [85649987, 85683118]      *
28 ENSMUST00000152501 [86021941, 86024669]      *

... ... ... .

olaps <- findOverlaps(A, B, select="all",type="equal",use.region="first")

isect <- pintersect(A[queryHits(olaps)], B[subjectHits(olaps)])

olaps2 <- data.frame(query=queryHits(olaps), subject=subjectHits(olaps),olap_width=width(isect))
written 5 months ago by RogelioG

Thanks! If I just run findOverlaps with the defaults, I believe it does what you need, no?

olaps <- findOverlaps(A, B)

isect <- pintersect(A[queryHits(olaps)], B[subjectHits(olaps)])
isect
GRanges object with 2 ranges and 1 metadata column:
      seqnames               ranges strand |       hit
         <Rle>            <IRanges>  <Rle> | <logical>
  [1]       12 [61767414, 61769859]      * |      TRUE
  [2]       21 [74288246, 74304336]      * |      TRUE
  -------
  seqinfo: 5 sequences from an unspecified genome; no seqlengths


olaps2 <- data.frame(query=queryHits(olaps), subject=subjectHits(olaps), olap_width=width(isect))
olaps2
  query subject olap_width
1     1       1       2446
2     2       2      16091

If it's not quite what you need, then just do some extra filtering on the rows of the olaps2 object, like:

olaps2[olaps2$query == olaps2$subject, ]

This will show just the overlaps between ranges on the same row.
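
As a cross-check on the same-row comparison, the overlap width for each pair of rows can also be computed directly from the coordinates, without going through findOverlaps at all. The following is only a minimal sketch in base R: the data frames and their start/end column names are assumptions that re-type the five sample rows posted above, row_overlap_width is a hypothetical helper, and it assumes that row i of both tables lies on the same chromosome.

```r
# Coordinates re-typed from the five sample rows above.
# The column names 'start' and 'end' are assumed for illustration.
a <- data.frame(start = c(61767414, 74288246, 86099036, 86099036, 87384576),
                end   = c(61851540, 74304336, 86111116, 86111970, 87394729))
b <- data.frame(start = c(61638823, 74285033, 85600702, 85649987, 86021941),
                end   = c(61769859, 74353692, 85645036, 85683118, 86024669))

# Hypothetical helper: width of the overlap between row i of x and row i of y.
# Intervals are treated as closed (both ends included), hence the + 1.
# Pairs that do not overlap get a width of 0.
row_overlap_width <- function(x, y) {
  stopifnot(nrow(x) == nrow(y))
  pmax(0, pmin(x$end, y$end) - pmax(x$start, y$start) + 1)
}

olap_width <- row_overlap_width(a, b)
data.frame(row = seq_along(olap_width), olap_width = olap_width)
```

For the sample rows this gives 2446 and 16091 for the first two pairs, the same widths as in the olaps2 table, and 0 for the remaining three. If you would rather stay within GenomicRanges, width(pintersect(A, B)) should do the same parallel, row-by-row comparison, provided A and B have the same length and each pair of rows shares a seqname; I have not run that on your data.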

written 5 months ago by Kevin Blighe