Has anyone looked into measuring the similarity of two samples of ChIP-seq-like data using cross-correlation or auto-correlation? Basically, I want some sort of metric that describes how similar two wiggle files are on a genome-wide scale or, conversely, how much of the variance in one sample can be explained by the pattern of distribution in the other. If you have, what software or library did you use? Alternatively, what different approach have you used?
I'm quite interested in this topic. I'm the developer of a package for digital signal processing of quantitative genomic data (such as ChIP-seq). Cross-correlation is something I was implementing too, although I had some issues with the border effect, and I was also not sure that cross-correlation (xcorr) is a proper measure of similarity for two ChIP samples. In addition, subtracting one ChIP from the other (after energy normalization) may have the same effect.
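To make the cross-correlation and subtraction ideas concrete, here is a rough Python/NumPy sketch. It is not the code from the package mentioned above; the function names (energy_normalize, normalized_xcorr, difference_energy) and the toy signals are made up for illustration. It assumes both samples have already been binned into equal-length coverage vectors for a single chromosome, and it handles the border simply by dropping bins that are shifted past the end, which is only one of several possible choices.

```python
import numpy as np

def energy_normalize(x):
    """Scale a signal so that its sum of squares (energy) equals 1."""
    x = np.asarray(x, dtype=float)
    energy = np.sqrt(np.sum(x ** 2))
    if energy == 0:
        raise ValueError("signal has zero energy")
    return x / energy

def normalized_xcorr(a, b, max_lag):
    """Cross-correlation of two energy-normalized signals.

    score(lag) = sum_i a[i] * b[i + lag]. Bins shifted past either end
    are dropped, so larger lags use fewer bins (the border effect).
    """
    a = energy_normalize(a)
    b = energy_normalize(b)
    if a.shape != b.shape:
        raise ValueError("signals must have the same length")
    n = a.size
    lags = np.arange(-max_lag, max_lag + 1)
    scores = np.empty(lags.size)
    for i, lag in enumerate(lags):
        if lag >= 0:
            scores[i] = np.dot(a[:n - lag], b[lag:])
        else:
            scores[i] = np.dot(a[-lag:], b[:n + lag])
    return lags, scores

def difference_energy(a, b):
    """Energy of the difference of the two energy-normalized signals."""
    diff = energy_normalize(a) - energy_normalize(b)
    return np.sum(diff ** 2)

# Toy data (hypothetical): one peak, and the same peak shifted by 3 bins.
sample_a = np.zeros(50)
sample_a[10:15] = [1, 3, 5, 3, 1]
sample_b = np.roll(sample_a, 3)

lags, scores = normalized_xcorr(sample_a, sample_b, max_lag=10)
best_lag = lags[np.argmax(scores)]
zero_lag_score = scores[lags == 0][0]
residual = difference_energy(sample_a, sample_b)
```

For unit-energy signals the energy of the difference equals 2 - 2 * (zero-lag score), so subtraction carries the same information as the correlation at lag 0, but it says nothing about shifted copies, which the other lags do capture.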
It seems that every time someone tries to use correlation to measure something like this, the 0-0 effect throws things off. I prefer just calling the peaks and counting overlaps (with findOverlaps from the R IRanges package, or with BEDTools).
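A small Python sketch of both points, under some assumptions. I am reading the "0-0 effect" as the many bins where both samples are zero, which pull a genome-wide correlation upward. The helper names (pearson_without_joint_zeros, count_overlapping_peaks) and all the numbers are hypothetical, and the overlap counter is a naive stand-in for a single chromosome, not a reimplementation of findOverlaps or BEDTools.

```python
import numpy as np

def pearson_without_joint_zeros(a, b):
    """Pearson r computed only over bins where at least one sample is nonzero."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    keep = (a != 0) | (b != 0)
    return np.corrcoef(a[keep], b[keep])[0, 1]

def count_overlapping_peaks(peaks_a, peaks_b):
    """Count peaks in A that overlap at least one peak in B.

    Peaks are half-open (start, end) intervals on the same chromosome.
    This is a simple quadratic scan, fine for a toy example only.
    """
    count = 0
    for start_a, end_a in peaks_a:
        if any(start_a < end_b and start_b < end_a
               for start_b, end_b in peaks_b):
            count += 1
    return count

# Toy coverage (hypothetical): 996 empty bins plus 4 bins where the
# two samples actually disagree.
cov_a = np.zeros(1000)
cov_b = np.zeros(1000)
cov_a[:4] = [5, 1, 4, 2]
cov_b[:4] = [2, 4, 1, 5]

r_all = np.corrcoef(cov_a, cov_b)[0, 1]                 # dominated by 0-0 bins
r_masked = pearson_without_joint_zeros(cov_a, cov_b)    # signal bins only

# Toy peak calls (hypothetical), as (start, end) pairs.
peaks_a = [(100, 200), (300, 400), (900, 950)]
peaks_b = [(150, 250), (400, 500), (920, 930)]
n_overlapping = count_overlapping_peaks(peaks_a, peaks_b)
```

In this toy case the correlation over all bins comes out clearly positive even though the four signal bins are negatively correlated, which is the kind of distortion that makes peak overlaps easier to interpret.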