Through a collaborator I have received bigWig (bw) files from bisulphite sequencing, which give me methylation percentages for cytosine residues. I need to correlate them with ChIP-seq data for a transcription factor. I haven't worked with methylation data before, so I don't understand how to get it into a form amenable to correlation analysis.
I don't think I can simply run a correlation using deepTools computeMatrix, as it will average over the window for the methylation bw, and that is not what is needed here. The other options (median, min, max, std, sum) also don't make much sense. Is there a tool that takes into account the sites that could potentially be methylated in a window and then gives me a value to be used for correlation?
Any general pointers towards doing methylation and chip-seq correlation may be useful too.
For correlation of genomic tracks with continuous data (whether or not they are intervals), I know of StereoGene, but I have not used it with methylation data in particular.
I have sometimes done this: classified CpG sites into discrete levels (low-, medium- and high-methylation) and then tested for the association/enrichment of these sets against the .BED of the transcription factor peaks with packages such as LOLA. I admit it is a "rougher" kind of association than a correlation.
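To make the binning step concrete, here is a minimal sketch in plain Python. The cut-offs (below 20% is "low", 80% and above is "high") and the toy CpG records are assumptions for illustration only, not values from any package; the resulting sets would then be written out as BED files and passed to an enrichment tool.

```python
from collections import Counter

# Hypothetical cut-offs in percent methylation; adjust to your own data.
LOW_MAX = 20.0
HIGH_MIN = 80.0


def classify_cpg(percent):
    """Return 'low', 'medium' or 'high' for one CpG methylation percentage."""
    if percent < LOW_MAX:
        return "low"
    if percent >= HIGH_MIN:
        return "high"
    return "medium"


# Toy records: (chrom, position, percent methylation). Made-up values.
cpg_sites = [
    ("chr1", 100, 5.0),
    ("chr1", 250, 45.0),
    ("chr1", 400, 92.5),
    ("chr1", 410, 88.0),
]

# Assign each site to a level and count how many sites fall in each set.
levels = [classify_cpg(pct) for _, _, pct in cpg_sites]
level_counts = Counter(levels)
print(level_counts)
```

Where exactly to put the thresholds is a judgment call; looking at the distribution of methylation percentages first is usually a good idea.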
I sort of found the answer myself. In case someone else has a similar issue: deepTools multiBigwigSummary works as expected here. Non-CG sites have no coverage and are thus not part of the average calculation.
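A small sketch of why this matters, in plain Python. This is not deepTools code; it only illustrates the behaviour described above, with uncovered (non-CG) bases represented as NaN and made-up numbers throughout. The helper names `covered_mean` and `pearson` are hypothetical.

```python
import math


def covered_mean(values):
    """Mean over positions that carry a value; NaN marks uncovered bases."""
    covered = [v for v in values if not math.isnan(v)]
    if not covered:
        return math.nan
    return sum(covered) / len(covered)


def pearson(xs, ys):
    """Pearson correlation of two equal-length lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return math.nan
    return cov / math.sqrt(var_x * var_y)


nan = math.nan
# Toy per-base methylation percentages for three windows (made-up values).
windows = [
    [nan, 80.0, nan, nan, 40.0, nan],
    [10.0, nan, nan, 10.0],
    [nan, 90.0, nan],
]
# Toy mean ChIP-seq signal for the same three windows (made-up values).
chip_signal = [2.0, 8.0, 1.0]

# Average only over covered positions, one value per window.
meth_means = [covered_mean(w) for w in windows]

# For comparison: treating uncovered bases as 0 drags the average down.
naive_first = sum(0.0 if math.isnan(v) else v for v in windows[0]) / len(windows[0])

r = pearson(meth_means, chip_signal)
print(meth_means, naive_first, r)
```

With the uncovered bases skipped, the first window averages its two CpG values only; counting those bases as zero would give a much lower number that reflects CpG density rather than methylation level.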