calculating overlaps with Peak calls for each exon
3.5 years ago
pt.taklifi ▴ 60

I have a list of exons and a list of peak calls, both in .txt format.

Exons

 structure(list(chr1 = structure(c(1L, 1L, 1L, 1L, 1L), .Label = "chr1", class = "factor"), 
    X3857280 = c(3858717L, 3865811L, 3867973L, 3869604L, 3872471L
    ), X3857717 = c(3858844L, 3866000L, 3868053L, 3869775L, 3872572L
    ), ENST00000378209.7_exon_0_0_chr1_3857281_f = structure(1:5, .Label = c("ENST00000378209.7_exon_1_0_chr1_3858718_f", 
    "ENST00000378209.7_exon_2_0_chr1_3865812_f", "ENST00000378209.7_exon_3_0_chr1_3867974_f", 
    "ENST00000378209.7_exon_4_0_chr1_3869605_f", "ENST00000378209.7_exon_5_0_chr1_3872472_f"
    ), class = "factor"), X0 = c(0L, 0L, 0L, 0L, 0L), X. = structure(c(1L, 
    1L, 1L, 1L, 1L), .Label = "+", class = "factor")), class = "data.frame", row.names = c(NA, 
-5L))

Peak Calls

structure(list(seqnames = structure(c(1L, 1L, 1L, 1L, 1L, 1L), .Label = "chr1", class = "factor"), 
    start = c(975451L, 1014228L, 1290080L, 1291099L, 1291742L, 
    1327977L), end = c(975952L, 1014729L, 1290581L, 1291600L, 
    1292243L, 1328478L), name = structure(c(5L, 6L, 1L, 2L, 3L, 
    4L), .Label = c("BRCA_123", "BRCA_124", "BRCA_125", "BRCA_143", 
    "BRCA_39", "BRCA_55"), class = "factor"), score = c(1.87842575038562, 
    4.07469686212787, 2.44358820293876, 3.18019908767794, 8.26783029566134, 
    1.08246502080444), annotation = structure(c(1L, 1L, 1L, 1L, 
    1L, 1L), .Label = "3' UTR", class = "factor"), percentGC = c(0.6187624750499, 
    0.62874251497006, 0.678642714570858, 0.702594810379242, 0.640718562874252, 
    0.676646706586826), percentAT = c(0.3812375249501, 0.37125748502994, 
    0.321357285429142, 0.297405189620758, 0.359281437125749, 
    0.323353293413174)), class = "data.frame", row.names = c(NA, 
-6L))

For each exon, I want to check whether it overlaps any of the peaks and, if it does, what percentage of the exon is covered by the peak. If an exon overlaps more than one peak, I want to report that as well. I then want to store the results in a new table or data frame. I can't think of any approach other than a for loop, and since my data is rather large I'm looking for efficient code. I'm currently working in R, but I can also do some coding in the Ubuntu terminal.

R overlap bioconductor exon • 629 views

FYI, if you have data in R and want to share it in an easy copy/paste fashion, use dput() on the object. It creates an ASCII representation of the data that you can share here, so users can quickly load your example data rather than typing it in. You can use edit to add content to your post.
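As a small illustration of the dput() round trip, here is a sketch on a made-up data frame (`df` and its values are hypothetical, not the poster's data). The text that dput() prints can be pasted back into R and evaluated to rebuild the object:

```r
# Hypothetical example data frame (not taken from the question).
df <- data.frame(
  chr   = c("chr1", "chr1"),
  start = c(100L, 200L),
  end   = c(150L, 260L)
)

# dput() prints an R expression that recreates the object.
# For large tables, sharing a subset such as dput(head(df, 5)) is usually enough.
dput(df)

# Round trip: capture the printed text and evaluate it again,
# which is what a reader does when pasting it into their session.
txt <- paste(capture.output(dput(df)), collapse = "\n")
df2 <- eval(parse(text = txt))
identical(df, df2)
```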


OK, thanks for the advice. I converted my data to ASCII format.
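For the overlap question itself, one loop-free way to approach it is with the Bioconductor GenomicRanges package. The following is only a sketch under some assumptions: the toy `exons` and `peaks` tables and their column names are made up for illustration, coordinates are treated as 1-based and inclusive, and strand is ignored when matching. The exon names in the question (e.g. `..._chr1_3858718_f` with a start of 3858717) suggest the exon file is BED-like with 0-based starts; if so, `starts.in.df.are.0based = TRUE` in `makeGRangesFromDataFrame()` should handle that. The exon column names in the posted dput() output (`chr1`, `X3857280`, ...) also look like the first exon was read in as the header, so the file probably needs to be read with `header = FALSE`.

```r
library(GenomicRanges)

# Hypothetical toy data (not the poster's): 1-based, inclusive coordinates.
exons <- data.frame(
  chr    = "chr1",
  start  = c(101L, 501L, 901L),
  end    = c(200L, 600L, 1000L),
  name   = c("exon_1", "exon_2", "exon_3"),
  strand = "+"
)
peaks <- data.frame(
  seqnames = "chr1",
  start    = c(151L, 491L, 581L),
  end      = c(250L, 520L, 700L),
  name     = c("peak_A", "peak_B", "peak_C")
)

exon_gr <- makeGRangesFromDataFrame(exons, keep.extra.columns = TRUE)
peak_gr <- makeGRangesFromDataFrame(peaks, keep.extra.columns = TRUE)

# All exon/peak pairs that share at least one base, found without a loop.
hits <- findOverlaps(exon_gr, peak_gr, ignore.strand = TRUE)
q <- queryHits(hits)    # row index into exons
s <- subjectHits(hits)  # row index into peaks

# Overlap width per pair: from the larger start to the smaller end.
overlap_bp <- pmin(end(exon_gr)[q], end(peak_gr)[s]) -
  pmax(start(exon_gr)[q], start(peak_gr)[s]) + 1L

# One row per overlapping exon/peak pair.
pairs <- data.frame(
  exon       = exons$name[q],
  peak       = peaks$name[s],
  overlap_bp = overlap_bp,
  pct_exon   = 100 * overlap_bp / width(exon_gr)[q]
)

# One row per exon, including exons that overlap no peak.
exon_summary <- data.frame(
  exon    = exons$name,
  n_peaks = countOverlaps(exon_gr, peak_gr, ignore.strand = TRUE)
)
exon_summary$multi_peak <- exon_summary$n_peaks > 1

pairs
exon_summary
```

`pairs` holds the percentage of each exon covered by each individual peak, and `exon_summary` flags exons hit by more than one peak. If the peaks can overlap each other, adding up `pct_exon` per exon would double count; merging the peaks first with `reduce(peak_gr)` is one way around that.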
