I'm looking for a way to compare ChIP-seq signal to gene expression, and I'm not 100% sure of the best way to do this. Specifically, the end goal is to make a scatter plot with (normalized) RNA-seq counts on the X-axis and (normalized) ChIP-seq signal on the Y-axis, and to calculate a correlation coefficient.
I have a matrix of counts (from featureCounts) from a large RNA-seq experiment, but I don't have access to the raw files. I also have BAM and peak files from several ChIP-seq experiments that I've done in the same cell type. I'd like to see whether the strength of the ChIP-seq signal within genes (either the coverage in the BAM file or the peak score from the peak file), or possibly within X bp of the start of a gene, correlates with the RNA expression value of that gene. I can obviously normalize the RNA counts before comparing them to the ChIP-seq data, but this is where I get lost: I'm not sure how to set up the analysis. One thing I had thought about was running the ChIP BAM files through featureCounts as well, but I'm not sure that's the optimal way to do it, and even then I'm a little confused about what I would do with the resulting matrix. Would it be better if I could obtain the RNA BAM files?
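To make the end goal concrete, here is a minimal sketch of the comparison I have in mind, assuming I already had one RNA count and one ChIP count per gene (for example from counting ChIP reads over the same gene annotation). The gene names and numbers are made-up toy data, the helper names are hypothetical, and the choice of CPM scaling, log2(x + 1), and Spearman correlation is just an assumption on my part rather than something I know to be the right approach.

```python
import math

def cpm(counts):
    """Counts per million: scale each count by the library total."""
    total = sum(counts)
    return [c * 1e6 / total for c in counts]

def ranks(values):
    """1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg
        i = j + 1
    return result

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))

# Hypothetical toy data: one count per gene for each assay.
rna_counts = {"geneA": 500, "geneB": 20, "geneC": 1500, "geneD": 0}
chip_counts = {"geneA": 80, "geneB": 10, "geneC": 200, "geneD": 5}

# Keep only genes present in both tables, in a fixed order.
genes = sorted(set(rna_counts) & set(chip_counts))

# Normalize for library size, then log-transform for plotting.
rna = [math.log2(v + 1) for v in cpm([rna_counts[g] for g in genes])]
chip = [math.log2(v + 1) for v in cpm([chip_counts[g] for g in genes])]

rho = spearman(rna, chip)
print(f"Spearman rho across {len(genes)} genes: {rho:.3f}")
```

The `rna` and `chip` lists would be the X and Y values of the scatter plot. If the ChIP signal were counted over whole gene bodies rather than a fixed window around the start of each gene, I assume some length correction would also be needed, which this sketch does not do.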
I often use deepTools when I want to find the correlation between two samples, but that's usually more straightforward because I just use BAMs/bigWigs over bins or a consensus peak set. Working from count data is what's confusing me! Thanks in advance for any suggestions/recommendations!