Question: calculating overlaps with Peak calls for each exon
Asked 4 weeks ago by pt.taklifi:

I have a list of exons and a list of peak calls, both in .txt format. The dput() output of the first few rows of each is shown below.

Exons

 structure(list(chr1 = structure(c(1L, 1L, 1L, 1L, 1L), .Label = "chr1", class = "factor"), 
    X3857280 = c(3858717L, 3865811L, 3867973L, 3869604L, 3872471L
    ), X3857717 = c(3858844L, 3866000L, 3868053L, 3869775L, 3872572L
    ), ENST00000378209.7_exon_0_0_chr1_3857281_f = structure(1:5, .Label = c("ENST00000378209.7_exon_1_0_chr1_3858718_f", 
    "ENST00000378209.7_exon_2_0_chr1_3865812_f", "ENST00000378209.7_exon_3_0_chr1_3867974_f", 
    "ENST00000378209.7_exon_4_0_chr1_3869605_f", "ENST00000378209.7_exon_5_0_chr1_3872472_f"
    ), class = "factor"), X0 = c(0L, 0L, 0L, 0L, 0L), X. = structure(c(1L, 
    1L, 1L, 1L, 1L), .Label = "+", class = "factor")), class = "data.frame", row.names = c(NA, 
-5L))

Peak Calls

structure(list(seqnames = structure(c(1L, 1L, 1L, 1L, 1L, 1L), .Label = "chr1", class = "factor"), 
    start = c(975451L, 1014228L, 1290080L, 1291099L, 1291742L, 
    1327977L), end = c(975952L, 1014729L, 1290581L, 1291600L, 
    1292243L, 1328478L), name = structure(c(5L, 6L, 1L, 2L, 3L, 
    4L), .Label = c("BRCA_123", "BRCA_124", "BRCA_125", "BRCA_143", 
    "BRCA_39", "BRCA_55"), class = "factor"), score = c(1.87842575038562, 
    4.07469686212787, 2.44358820293876, 3.18019908767794, 8.26783029566134, 
    1.08246502080444), annotation = structure(c(1L, 1L, 1L, 1L, 
    1L, 1L), .Label = "3' UTR", class = "factor"), percentGC = c(0.6187624750499, 
    0.62874251497006, 0.678642714570858, 0.702594810379242, 0.640718562874252, 
    0.676646706586826), percentAT = c(0.3812375249501, 0.37125748502994, 
    0.321357285429142, 0.297405189620758, 0.359281437125749, 
    0.323353293413174)), class = "data.frame", row.names = c(NA, 
-6L))

For each exon, I want to determine whether it overlaps any of the peaks and, if it does, what percentage of the exon overlaps the peak. If an exon overlaps more than one peak, I want to report that as well. I then want to store the results in a new table or data frame. I can't think of any approach other than a for loop, and since my data set is rather large, I'm looking for efficient code. I'm currently working in R, but I can also do some coding in the Ubuntu terminal.
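One loop-free way to approach this is an interval-overlap join with the Bioconductor package GenomicRanges. The sketch below is only an illustration under stated assumptions: the coordinates are hypothetical toy values (not the data posted above), the column names chr/start/end/name are chosen for the example, coordinates are treated as 1-based and inclusive, and strand is ignored.

```r
library(GenomicRanges)

# Hypothetical toy coordinates, chosen so that exons overlap
# one, two, and zero peaks respectively.
exons <- data.frame(
  chr   = "chr1",
  start = c(100L, 500L, 900L),
  end   = c(200L, 600L, 1000L),
  name  = c("exon_1", "exon_2", "exon_3")
)
peaks <- data.frame(
  seqnames = "chr1",
  start    = c(150L, 480L, 590L),
  end      = c(250L, 520L, 700L),
  name     = c("peak_A", "peak_B", "peak_C")
)

# Convert both tables to GRanges objects (1-based, inclusive coordinates assumed)
exon_gr <- GRanges(exons$chr, IRanges(exons$start, exons$end), name = exons$name)
peak_gr <- GRanges(peaks$seqnames, IRanges(peaks$start, peaks$end), name = peaks$name)

# All exon/peak pairs that overlap, found in one vectorised call
hits <- findOverlaps(exon_gr, peak_gr, ignore.strand = TRUE)
q <- queryHits(hits)    # index into exon_gr
s <- subjectHits(hits)  # index into peak_gr

# Overlap width in bp for each pair, and as a percentage of the exon length
ov_bp <- pmin(end(exon_gr)[q], end(peak_gr)[s]) -
         pmax(start(exon_gr)[q], start(peak_gr)[s]) + 1L

# One row per overlapping exon/peak pair
pairs <- data.frame(
  exon       = exon_gr$name[q],
  peak       = peak_gr$name[s],
  overlap_bp = ov_bp,
  pct_exon   = 100 * ov_bp / width(exon_gr)[q]
)

# One row per exon: number of overlapping peaks and total percentage covered.
# Summing is only valid if the peaks do not overlap each other; if they can,
# merge them first with reduce(peak_gr).
summary_tbl <- data.frame(
  exon        = exon_gr$name,
  n_peaks     = countOverlaps(exon_gr, peak_gr, ignore.strand = TRUE),
  pct_covered = 0
)
tot <- tapply(ov_bp, q, sum)
idx <- as.integer(names(tot))
summary_tbl$pct_covered[idx] <- 100 * as.numeric(tot) / width(exon_gr)[idx]

pairs
summary_tbl
```

Two things would need checking before applying this to the real files. The exon table's column names (chr1, X3857280, X3857717, ...) suggest that the file was read with header = TRUE, so the first exon may have been consumed as the header row. Also, if the exon file is in BED format, its start coordinates are 0-based; makeGRangesFromDataFrame() has a starts.in.df.are.0based argument for that case.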

Tags: exon, bioconductor, R, overlap

FYI, if you have data in R and want to share it in an easy copy/paste fashion, use dput() on the object. It creates an ASCII representation of the data that you can paste here, so users can quickly load your example data rather than typing it in. You can use edit to add content to your post.

written 4 weeks ago by ATpoint
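As a small illustration of the dput() suggestion, the sketch below prints a copy/paste-able representation of the first rows of a table and shows that the text can be turned back into an object. The table peaks_example is a hypothetical stand-in, not the poster's data.

```r
# Hypothetical toy table standing in for a real peak file
peaks_example <- data.frame(
  seqnames = c("chr1", "chr1", "chr1"),
  start    = c(100L, 500L, 900L),
  end      = c(200L, 600L, 1000L),
  stringsAsFactors = FALSE
)

# Print a text representation of the first two rows; this is what gets pasted into a post
dput(head(peaks_example, 2))

# A reader can rebuild the object by evaluating the pasted text
txt     <- paste(capture.output(dput(head(peaks_example, 2))), collapse = "\n")
rebuilt <- eval(parse(text = txt))

# Compare the rebuilt table with the original rows
all.equal(rebuilt, head(peaks_example, 2))
```

Sharing only head() of a large table keeps the pasted text short while preserving column names and types.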

OK, thanks for the advice. I converted my data to ASCII format.

written 4 weeks ago by pt.taklifi (modified 4 weeks ago)