I have a very large file of differentially methylated cytosines (DmC) and another file of gene annotations, including CDS etc. I need to produce a graph like this one.
I wanted to use GenomicRanges for this, so that I can then find out how many of the sites are >5 kb away from a TSS and do other manipulations.
But GenomicRanges does not seem to accept sites with the same start and end, as is the case with differentially methylated cytosines. So I was wondering whether somebody has done this using GenomicRanges.
Another way I could do it is with a binary search (using File::SortedSeek in Perl), but I am looking for a cookbook solution if one is available.
Can anyone explain how to get the fold enrichment in the bar plot, and the p-value?
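For reference, the binary-search idea can be sketched in a few lines. This is only a rough Python illustration using the standard `bisect` module, not a cookbook solution: the function name `nearest_tss_distance` and the toy coordinates are made up, and it assumes a single chromosome with the TSS positions already loaded into a sorted list.

```python
from bisect import bisect_left

def nearest_tss_distance(tss_sorted, pos):
    """Return the distance from pos to the closest value in tss_sorted.

    tss_sorted must be a non-empty, ascending list of TSS coordinates
    for one chromosome (hypothetical input layout).
    """
    i = bisect_left(tss_sorted, pos)  # index of first TSS >= pos
    candidates = []
    if i < len(tss_sorted):
        candidates.append(tss_sorted[i] - pos)      # nearest TSS at or after pos
    if i > 0:
        candidates.append(pos - tss_sorted[i - 1])  # nearest TSS before pos
    return min(candidates)

# Toy data, invented for illustration only.
tss = sorted([1000, 20000, 50000])
dmc = [900, 7000, 26000, 50000]

distances = [nearest_tss_distance(tss, p) for p in dmc]
far = sum(d > 5000 for d in distances)  # number of DmC more than 5 kb from any TSS
print(distances, far)
```

For real data one would keep one sorted list per chromosome and possibly take strand into account, which this sketch ignores.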
The fold enrichment there is just the ratio of DmC to total mC in a given region, divided by the same ratio in CDS. At least that's how it appears, though I suspect the figure legend would actually say. I would presume that the p-values are from Fisher's exact test.
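To make that concrete, here is a minimal Python sketch of both calculations, assuming the reading above is right (the figure legend may define things differently). The counts are invented, and the function names `fold_enrichment` and `fisher_exact_two_sided` are my own. The Fisher test is written out with the standard library only and sums the probabilities of all tables no more likely than the observed one; that is fine for small counts, but for genome-scale counts a statistics package would be the better choice.

```python
from math import comb

def fold_enrichment(dmc_region, mc_region, dmc_cds, mc_cds):
    """(DmC / total mC in the region) divided by (DmC / total mC in CDS)."""
    return (dmc_region / mc_region) / (dmc_cds / mc_cds)

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of every table with the same
    margins that is no more probable than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Probability that the top-left cell equals x given fixed margins.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    probs = [prob(x) for x in range(lo, hi + 1)]
    # Small relative tolerance so ties with p_obs are not lost to rounding.
    return min(1.0, sum(p for p in probs if p <= p_obs * (1 + 1e-7)))

# Invented counts: DmC and total mC in one region and in CDS.
dmc_region, mc_region = 50, 100
dmc_cds, mc_cds = 25, 100

fe = fold_enrichment(dmc_region, mc_region, dmc_cds, mc_cds)
# Rows: region, CDS. Columns: DmC, mC that is not differentially methylated.
p = fisher_exact_two_sided(dmc_region, mc_region - dmc_region,
                           dmc_cds, mc_cds - dmc_cds)
print(fe, p)
```

Each bar in the plot would then get its own 2x2 table against CDS, with the region's counts in the first row.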