Row-specific findOverlaps of two GRanges datasets
3.8 years ago
RogelioG • 0

Hi all,

I am working on a way to find sense/antisense encoded transcripts and the amount of overlap they have.

There is one final piece of code I can't get to work: I have two GRanges objects (built from data frames) with the Ensembl transcript IDs and the start and end position of each transcript, one for the sense and one for the antisense encoded transcripts. From these I want to calculate the amount of overlap between the corresponding sense/antisense transcripts that are in the same row.

Here is the issue: findOverlaps compares every range in one GRanges object against every range in the other, whereas I need it to look for overlap only between ranges with the same row number in both objects. It is probably easy to do, but I can't find a solution in the documentation.

Thanks beforehand!

RG

GRanges R overlap • 1.8k views

It would help to paste some sample data and to show the result that you want.

Data and part of the script I am running:

A <- makeGRangesFromDataFrame(a)
B <- makeGRangesFromDataFrame(b)

A             seqnames               ranges strand
                <Rle>            <IRanges>  <Rle>
12 ENSMUST00000138535 [61767414, 61851540]      *
21 ENSMUST00000016309 [74288246, 74304336]      *
25 ENSMUST00000027431 [86099036, 86111116]      *
26 ENSMUST00000027431 [86099036, 86111970]      *
28 ENSMUST00000113212 [87384576, 87394729]      *

B            seqnames               ranges strand
                <Rle>            <IRanges>  <Rle>
12 ENSMUST00000129030 [61638823, 61769859]      *
21 ENSMUST00000087226 [74285033, 74353692]      *
25 ENSMUST00000080204 [85600702, 85645036]      *
26 ENSMUST00000054279 [85649987, 85683118]      *
28 ENSMUST00000152501 [86021941, 86024669]      *

... ... ... .

olaps <- findOverlaps(A, B, select="all",type="equal",use.region="first")

isect <- pintersect(A[queryHits(olaps)], B[subjectHits(olaps)])

olaps2 <- data.frame(query=queryHits(olaps), subject=subjectHits(olaps),olap_width=width(isect))

Thanks! If I just run findOverlaps with the defaults, I believe it does what you need, no?

olaps <- findOverlaps(A, B)

isect <- pintersect(A[queryHits(olaps)], B[subjectHits(olaps)])
isect
GRanges object with 2 ranges and 1 metadata column:
      seqnames               ranges strand |       hit
         <Rle>            <IRanges>  <Rle> | <logical>
  [1]       12 [61767414, 61769859]      * |      TRUE
  [2]       21 [74288246, 74304336]      * |      TRUE
  -------
  seqinfo: 5 sequences from an unspecified genome; no seqlengths


olaps2 <- data.frame(query=queryHits(olaps), subject=subjectHits(olaps), olap_width=width(isect))
olaps2
  query subject olap_width
1     1       1       2446
2     2       2      16091

If it's not quite what you need, just do some extra filtering on the olaps2 object, like:

olaps2[olaps2$query == olaps2$subject, ]

This keeps only the overlaps between ranges that are on the same row.
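
Since the goal is the overlap between the i-th range of A and the i-th range of B, another option is to skip findOverlaps and compute the width directly from the coordinates. Here is a minimal sketch in base R, assuming both sets have the same length and order and that each pair lies on the same sequence (strand is ignored). rowwise_overlap_width is a hypothetical helper name, not a GenomicRanges function:

```r
# Hypothetical helper (not part of GenomicRanges): overlap width, in bases,
# between the i-th range of one set and the i-th range of the other.
# Coordinates are 1-based and inclusive, as in GRanges.
rowwise_overlap_width <- function(start_a, end_a, start_b, end_b) {
  stopifnot(length(start_a) == length(end_a),
            length(start_b) == length(end_b),
            length(start_a) == length(start_b))
  # The shared interval runs from the larger start to the smaller end.
  shared <- pmin(end_a, end_b) - pmax(start_a, start_b) + 1
  # A non-positive value means the two ranges do not overlap.
  pmax(shared, 0)
}

# Coordinates copied from the five sample rows of A and B shown above.
a_start <- c(61767414, 74288246, 86099036, 86099036, 87384576)
a_end   <- c(61851540, 74304336, 86111116, 86111970, 87394729)
b_start <- c(61638823, 74285033, 85600702, 85649987, 86021941)
b_end   <- c(61769859, 74353692, 85645036, 85683118, 86024669)

olap_width <- rowwise_overlap_width(a_start, a_end, b_start, b_end)
print(data.frame(row = seq_along(olap_width), olap_width = olap_width))

# With real GRanges objects of equal length, the same call would be:
# rowwise_overlap_width(start(A), end(A), start(B), end(B))
```

For the sample data this gives 2446 and 16091 for the first two rows, the same widths as above, and 0 for the three pairs that do not overlap.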
