Genome-wide enrichment of histone marks and correlation with gene expression: Approach/Tools?
8 weeks ago
Ankit ▴ 350

Hi everyone,

I have ChIP-seq data for multiple histone marks (H3K9me3, H3K27me3, H3K27ac and so on; n=10). I want to quantify the enrichment of these marks at genomic features, e.g. promoters, gene bodies, and custom bins of n bp, so that I can correlate it with differentially expressed genes and infer whether the histone mark profile influences gene expression. If you can suggest an R package or tool that does this end to end, that would be the best option.

If not,

Here are the approaches I am following:

  1. Using the BAM files of the histone marks to predict chromatin states with ChromHMM, then overlapping the result with genomic features to find the regions that fall in each chromatin state (the BED file has chromatin state coordinates in 200 bp spans). This approach seems OK, but a single promoter often overlaps multiple chromatin states, which makes it difficult to determine the exact state of that promoter.
  2. Using the BAM files and gene or promoter coordinates, I created a table of depth-normalized counts. But from this table, how can I tell which histone mark is more abundant than another, and whether that directly reflects a particular chromatin state?
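
One simple way to handle the multiple-state problem in approach 1 is to give each promoter the state that covers the most base pairs, and to keep the per-state fractions alongside it. The sketch below is only an illustration under that assumption: the function name `dominant_state`, the state labels (`E1`, `E3`, `E5`) and the coordinates are all made up, and intervals are assumed to be half-open, BED-style.

```python
from collections import defaultdict

def dominant_state(promoter, segments):
    """Return (state with largest overlap, {state: fraction of promoter covered}).

    promoter: (chrom, start, end)
    segments: iterable of (chrom, start, end, state), e.g. rows of a
              ChromHMM segmentation BED file (half-open coordinates assumed).
    """
    chrom, p_start, p_end = promoter
    covered = defaultdict(int)  # base pairs of the promoter covered per state
    for s_chrom, s_start, s_end, state in segments:
        if s_chrom != chrom:
            continue
        overlap = min(p_end, s_end) - max(p_start, s_start)
        if overlap > 0:
            covered[state] += overlap
    if not covered:
        return None, {}
    length = p_end - p_start
    fractions = {state: bp / length for state, bp in covered.items()}
    best = max(fractions, key=fractions.get)
    return best, fractions

# Toy segmentation (hypothetical values, not real data)
segments = [
    ("chr1", 800, 1200, "E1"),
    ("chr1", 1200, 2600, "E3"),
    ("chr1", 2600, 3200, "E1"),
    ("chr2", 1000, 3000, "E5"),
]
promoter = ("chr1", 1000, 3000)
state, fractions = dominant_state(promoter, segments)
print(state, fractions)
```

Here the promoter is assigned `E3` because that state covers 1400 of its 2000 bp. Keeping the fractions rather than only the winning label lets you flag promoters where no single state clearly dominates.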

I would appreciate any help.

Thank you

histone genomic rna-seq chip-seq chromhmm • 266 views
8 weeks ago
LChart 840

Short answer: I haven't managed to find any libraries that perform this kind of analysis end to end. Because the input is multivariate (many histone marks), the standard approach is to build a model that predicts expression from histone modifications (and potentially additional local features such as TF motifs), and then interrogate the model to make inferences.

Most methods for predicting gene expression from chromatin state (histone modifications +/- accessibility) use a binning strategy (counts normalized for library size, potentially also normalized to background or no-antibody control) as opposed to pre-selection of features via ChromHMM; though the ChromHMM output could prove useful as an additional feature or for model interrogation.
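
To make the binning normalization concrete, here is a minimal sketch of one common variant: counts per million (CPM) for library size, then a log2 ratio of ChIP over control with a pseudocount. The function names, the pseudocount of 1 and the example numbers are assumptions chosen for illustration, not taken from any of the methods mentioned here.

```python
import math

def cpm(count, library_size):
    """Scale a raw read count to counts per million mapped reads."""
    return count * 1e6 / library_size

def log2_enrichment(chip_count, chip_lib, ctrl_count, ctrl_lib, pseudocount=1.0):
    """log2 of ChIP CPM over control CPM for one bin.

    The pseudocount (an assumption here) avoids division by zero and
    damps ratios in bins with very few reads.
    """
    chip = cpm(chip_count, chip_lib) + pseudocount
    ctrl = cpm(ctrl_count, ctrl_lib) + pseudocount
    return math.log2(chip / ctrl)

# Toy numbers (hypothetical): one promoter bin, 20M ChIP reads, 10M control reads
enriched = log2_enrichment(chip_count=300, chip_lib=20_000_000,
                           ctrl_count=30, ctrl_lib=10_000_000)
flat = log2_enrichment(chip_count=20, chip_lib=20_000_000,
                       ctrl_count=10, ctrl_lib=10_000_000)
print(enriched, flat)
```

Because each mark is expressed relative to its own control, the resulting values are on a more comparable scale across marks than depth-normalized counts alone, although antibody efficiency still differs between marks.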

Gene expression has been predicted from histone marks using linear models, support vector machines, random forests, and neural networks of various architectures (A, B, C, D). Code is available for the SVM and attention-based NN approaches, but not as R packages. IntePareto is a package for a very simple Z-score-based analysis of RNA and histone state.
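
For orientation, the simplest of these model families, a linear model, can be sketched in a few lines. The example below fits ordinary least squares through the normal equations using only the standard library; it is not the implementation from any of the linked papers. The function name `fit_linear_model` is hypothetical, and the toy values are made up to follow an exact linear rule so the fit can be checked, so they carry no biological meaning.

```python
def fit_linear_model(X, y):
    """Ordinary least squares via the normal equations (X'X) b = X'y.

    X: list of rows (one per gene), each a list of per-mark signals.
    y: list of expression values (e.g. log2 expression), one per gene.
    Returns [intercept, coef_mark_1, ..., coef_mark_k].
    """
    rows = [[1.0] + list(r) for r in X]  # prepend the intercept column
    k = len(rows[0])
    # Augmented matrix [X'X | X'y]
    A = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
         + [sum(r[i] * yi for r, yi in zip(rows, y))]
         for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting
    for col in range(k):
        pivot = max(range(col, k), key=lambda idx: abs(A[idx][col]))
        if abs(A[pivot][col]) < 1e-12:
            raise ValueError("features are collinear; cannot fit")
        A[col], A[pivot] = A[pivot], A[col]
        for idx in range(k):
            if idx != col:
                factor = A[idx][col] / A[col][col]
                A[idx] = [a - factor * b for a, b in zip(A[idx], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]

# Toy data (hypothetical): promoter signal for two marks per gene,
# columns = [H3K27ac, H3K27me3]; y generated as 1 + 2*ac - 1*me3.
X = [[2.0, 0.0], [1.5, 0.5], [0.2, 2.0], [0.0, 1.5], [1.0, 1.0]]
y = [5.0, 3.5, -0.6, -0.5, 2.0]
coefs = fit_linear_model(X, y)
print([round(c, 6) for c in coefs])
```

The sign and size of each coefficient is what you would then interrogate. In practice you would fit this with an established routine (for example `lm()` in R) rather than hand-rolled algebra, and check it on held-out genes before reading anything into the coefficients.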


Hi LChart,

Thanks for the suggestions. IntePareto seems a good option to start with; however, it does not provide any information on the influence of chromatin state on the expression of individual genes. The others are machine-learning-based models, which I am not really good at. I am trying to figure out how to combine several packages to make sense of the data.

I would welcome more suggestions for now.

