I have two lists of comparisons between three conditions (1->2 and 2->3):
- List of differential gene expression (genes) (RNA-seq data analyzed with DESeq2)
- List of differential H3K27ac binding (peaks) (ChIP-seq data analyzed with MAnorm)
I want to find the correlation between gene expression dynamics and H3K27ac binding dynamics. For example, a gene that is upregulated between condition 1 and condition 2, where the H3K27ac binding near this gene also goes up.
The problem is that the H3K27ac differential binding data consists of peaks. Each peak has its own p-value (significance of differential binding). When I annotate these peaks to the nearest gene (bedtools closest), most genes end up with more than one peak nearby, and it is no longer clear which p-value belongs to the gene. How do I combine the different peaks that annotate to the same gene while retaining the statistical information? What I need is a list of genes ranked by significance (or fold change) of differential H3K27ac binding. Once I have this, I can compare it with the list of genes ranked by differential expression and look for correlation.
Help is much appreciated.