Hi all, I would like to pose a general question about canonical pathway/functional analysis of ChIP-seq data (as well as ATAC-seq and other similar sequencing techniques).
Functional analysis that looks for enriched canonical pathways is usually an important part of ChIP-seq downstream analysis after peak calling, and there are many tools and reference databases one can use for this purpose. I do not wish to debate which is best, but rather to ask what the appropriate input should be. To me, pathway analysis only makes sense if it is restricted to promoter peaks, where there is a high likelihood that TF binding within the promoter/near the TSS of a gene is really regulating the transcription of that gene. For peaks elsewhere, e.g. within the gene body (likely acting as enhancers), the link is much weaker, since enhancer-target gene maps (especially for highly niche tissue contexts) are much less well characterised.
Therefore, shouldn't we just leave non-promoter peaks out of the equation altogether, to avoid possibly spurious results? Given a new sequencing dataset, I tend to run the analysis both ways (all peaks vs promoter peaks only) and tend to get quite different results, which makes sense since these are essentially different inputs. Is that what everyone else is doing, or am I missing something?
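
To make the "all peaks vs promoter peaks" split concrete, here is a minimal sketch of what I mean. The promoter window (1 kb upstream to 200 bp downstream of the TSS) is an assumption, not a standard, and `promoter_interval` and `split_peaks` are hypothetical helper names for illustration only. The linear scan is fine for a toy example; a real analysis would use an established peak annotation tool.

```python
# Assumed promoter window: 1 kb upstream to 200 bp downstream of the TSS.
# These values are illustrative; pick whatever your analysis justifies.
UPSTREAM = 1000
DOWNSTREAM = 200


def promoter_interval(tss, strand, upstream=UPSTREAM, downstream=DOWNSTREAM):
    """Return the (start, end) promoter window around a TSS, strand-aware."""
    if strand == "+":
        return tss - upstream, tss + downstream
    # On the minus strand, "upstream" lies at higher coordinates.
    return tss - downstream, tss + upstream


def split_peaks(peaks, genes, upstream=UPSTREAM, downstream=DOWNSTREAM):
    """Split peaks into promoter and non-promoter sets.

    peaks: list of (chrom, start, end)
    genes: list of (gene_name, chrom, tss, strand)
    Returns (promoter_hits, other_peaks). promoter_hits maps each promoter
    peak to the set of genes whose promoter window it overlaps.
    """
    # Group promoter windows by chromosome.
    windows = {}
    for name, chrom, tss, strand in genes:
        lo, hi = promoter_interval(tss, strand, upstream, downstream)
        windows.setdefault(chrom, []).append((lo, hi, name))

    promoter_hits, other_peaks = {}, []
    for chrom, start, end in peaks:
        # A peak is a promoter peak if it overlaps any promoter window.
        hits = {
            name
            for lo, hi, name in windows.get(chrom, [])
            if start <= hi and end >= lo
        }
        if hits:
            promoter_hits[(chrom, start, end)] = hits
        else:
            other_peaks.append((chrom, start, end))
    return promoter_hits, other_peaks


if __name__ == "__main__":
    # Made-up toy coordinates, purely for illustration.
    genes = [("GENE_A", "chr1", 10000, "+"), ("GENE_B", "chr1", 50000, "-")]
    peaks = [
        ("chr1", 9500, 9700),    # inside the GENE_A promoter window
        ("chr1", 30000, 30200),  # far from any TSS
        ("chr1", 50500, 50700),  # upstream of GENE_B on the minus strand
    ]
    promoter_hits, other_peaks = split_peaks(peaks, genes)
    print(sorted(g for hit in promoter_hits.values() for g in hit))
    print(other_peaks)
```

The promoter-only gene list would then go into the enrichment tool as one input, and the gene list from all peaks (however the non-promoter peaks are assigned to genes) as the other. That assignment step for non-promoter peaks is exactly the part I am unsure about.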