Question: Calculating overlaps with peak calls for each exon
4 months ago, pt.taklifi wrote:

I have a list of exons and a list of peak calls, both in .txt format. Below is the dput() output for the first few rows of each.

Exons

 structure(list(chr1 = structure(c(1L, 1L, 1L, 1L, 1L), .Label = "chr1", class = "factor"), 
    X3857280 = c(3858717L, 3865811L, 3867973L, 3869604L, 3872471L
    ), X3857717 = c(3858844L, 3866000L, 3868053L, 3869775L, 3872572L
    ), ENST00000378209.7_exon_0_0_chr1_3857281_f = structure(1:5, .Label = c("ENST00000378209.7_exon_1_0_chr1_3858718_f", 
    "ENST00000378209.7_exon_2_0_chr1_3865812_f", "ENST00000378209.7_exon_3_0_chr1_3867974_f", 
    "ENST00000378209.7_exon_4_0_chr1_3869605_f", "ENST00000378209.7_exon_5_0_chr1_3872472_f"
    ), class = "factor"), X0 = c(0L, 0L, 0L, 0L, 0L), X. = structure(c(1L, 
    1L, 1L, 1L, 1L), .Label = "+", class = "factor")), class = "data.frame", row.names = c(NA, 
-5L))

Peak Calls

structure(list(seqnames = structure(c(1L, 1L, 1L, 1L, 1L, 1L), .Label = "chr1", class = "factor"), 
    start = c(975451L, 1014228L, 1290080L, 1291099L, 1291742L, 
    1327977L), end = c(975952L, 1014729L, 1290581L, 1291600L, 
    1292243L, 1328478L), name = structure(c(5L, 6L, 1L, 2L, 3L, 
    4L), .Label = c("BRCA_123", "BRCA_124", "BRCA_125", "BRCA_143", 
    "BRCA_39", "BRCA_55"), class = "factor"), score = c(1.87842575038562, 
    4.07469686212787, 2.44358820293876, 3.18019908767794, 8.26783029566134, 
    1.08246502080444), annotation = structure(c(1L, 1L, 1L, 1L, 
    1L, 1L), .Label = "3' UTR", class = "factor"), percentGC = c(0.6187624750499, 
    0.62874251497006, 0.678642714570858, 0.702594810379242, 0.640718562874252, 
    0.676646706586826), percentAT = c(0.3812375249501, 0.37125748502994, 
    0.321357285429142, 0.297405189620758, 0.359281437125749, 
    0.323353293413174)), class = "data.frame", row.names = c(NA, 
-6L))

For each exon, I want to determine whether it overlaps any of the peaks and, if it does, what percentage of the exon is covered by the peak. If an exon overlaps more than one peak, I want to report that as well. I then want to store the results in a new table or data frame. Other than a for loop, I can't think of a way to do this. Since my data is rather big, I'm looking for efficient code. I'm currently working in R, but I can also do some coding in the Ubuntu terminal.

exon bioconductor R overlap • 162 views

FYI, if you have data in R and want to share it in an easy copy/paste fashion, use dput() on the object. It creates an ASCII representation of the data that you can share here, so users can quickly load your example data rather than typing it in. You can use edit to add content to your post.

written 4 months ago by ATpoint

OK, thanks for the advice. I converted my data to ASCII format.

modified 4 months ago • written 4 months ago by pt.taklifi
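
One loop-free way to approach the question above is the Bioconductor package GenomicRanges: convert both tables to GRanges objects, find all exon/peak pairs with findOverlaps(), and compute the overlap width for each pair with vectorised arithmetic. The following is only a minimal sketch under stated assumptions. The toy coordinates and the column names chr, start, end, name and strand for the exon table are invented for illustration (the exon table in the question was read without a header, so its columns would need renaming first), and overlaps are counted regardless of strand.

```r
library(GenomicRanges)

# Toy data for illustration only (not the poster's data).
# Column names for the exon table are assumed; rename your own columns to match.
exons <- data.frame(
  chr    = "chr1",
  start  = c(100L, 500L, 900L),
  end    = c(199L, 599L, 999L),
  name   = c("exon_1", "exon_2", "exon_3"),
  strand = "+"
)
peaks <- data.frame(
  seqnames = "chr1",
  start    = c(150L, 180L, 550L),
  end      = c(170L, 250L, 700L),
  name     = c("peak_1", "peak_2", "peak_3")
)

# Convert to GRanges, keeping the extra columns (here: name) as metadata
exon_gr <- makeGRangesFromDataFrame(exons, keep.extra.columns = TRUE)
peak_gr <- makeGRangesFromDataFrame(peaks, keep.extra.columns = TRUE)

# All overlapping (exon, peak) pairs, ignoring strand
hits <- findOverlaps(exon_gr, peak_gr, ignore.strand = TRUE)
q <- queryHits(hits)    # index into exon_gr
s <- subjectHits(hits)  # index into peak_gr

# Width of each pairwise overlap, computed without a loop
overlap_bp <- pmin(end(exon_gr)[q], end(peak_gr)[s]) -
  pmax(start(exon_gr)[q], start(peak_gr)[s]) + 1L

# Number of peaks overlapping each exon (0 for exons without any hit)
n_peaks <- countOverlaps(exon_gr, peak_gr, ignore.strand = TRUE)

# One row per overlapping exon/peak pair
res <- data.frame(
  exon       = exon_gr$name[q],
  peak       = peak_gr$name[s],
  overlap_bp = overlap_bp,
  pct_exon   = 100 * overlap_bp / width(exon_gr)[q],
  n_peaks    = n_peaks[q]
)
print(res)
```

Exons overlapping more than one peak appear on several rows with n_peaks greater than 1. Note that pct_exon is computed per exon/peak pair, so if peaks overlap each other, summing it per exon would count the shared bases twice; calling reduce() on the peaks first is one way to avoid that. If the exon file is BED-style with 0-based starts (which the exon names in the question suggest, but which is an assumption here), makeGRangesFromDataFrame() accepts starts.in.df.are.0based = TRUE.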