Question: ChromHMM Output Descriptions
Sinji (UT Southwestern Medical Center) wrote, 3.5 years ago:

I've been doing some work that involves characterizing potential chromatin states in an HCT116 cell model. I've been able to run ChromHMM to identify chromatin states from a variety of histone marks and then overlap them with other datasets to double-check their annotation.

However, I am having trouble understanding some of the output files that ChromHMM automatically generates, specifically _emissions.txt and _*.bed. I know there are a couple of people here who are really familiar with the software and could probably help me out.

I have already searched Google and read the ChromHMM manuscript, but neither provided answers.

Ryan Dale (Bethesda, MD) wrote, 3.5 years ago:

The _emissions.txt file contains the values that go into the _emissions.png figure. Each row is a state and each column is an input data file (a "mark", or histone mark, in ChromHMM terminology). In the figure, darker blue indicates a higher likelihood of finding that mark in that state. These values, combined with running OverlapEnrichment on biologically meaningful datasets, are critical for figuring out how to interpret the states.
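To make the row/column layout concrete, here is a minimal sketch of reading such a table and listing the marks that stand out in each state. It assumes the file is tab-delimited with a header row of mark names and one row per state; the example values, the `read_emissions` and `top_marks` helpers, and the 0.5 cutoff are all made up for illustration, so check them against your own file.

```python
import csv
import io

# Hypothetical emissions table. Real files are ASSUMED to be tab-delimited,
# with a header row of mark names and one row per state.
EXAMPLE = (
    "state\tH3K4me3\tH3K27ac\tH3K27me3\n"
    "1\t0.95\t0.80\t0.02\n"
    "2\t0.03\t0.01\t0.70\n"
)

def read_emissions(handle):
    """Return (marks, {state: {mark: value}}) from a tab-delimited table."""
    reader = csv.reader(handle, delimiter="\t")
    header = next(reader)
    marks = header[1:]  # first column holds the state identifier
    table = {}
    for row in reader:
        if not row:
            continue  # skip blank lines
        table[row[0]] = {m: float(v) for m, v in zip(marks, row[1:])}
    return marks, table

def top_marks(table, threshold=0.5):
    """For each state, list the marks whose value is at least `threshold`."""
    return {
        state: [m for m, v in values.items() if v >= threshold]
        for state, values in table.items()
    }

marks, table = read_emissions(io.StringIO(EXAMPLE))
strong = top_marks(table)
for state, names in strong.items():
    print(state, names)
```

With a real file you would pass `open(path, newline="")` instead of the `io.StringIO` wrapper. The cutoff is only a quick way to summarize a state; it does not replace looking at the heatmap and the enrichments.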

The segments.bed file partitions the genome into contiguous segments, and the names of each feature in that file (E1, E2, etc) correspond to the states (1, 2, etc) in the _emissions.png.

A typical workflow is to figure out what to label each state. Then choose some colors and post-process the BED file with labels and names to get something more useful for downstream analysis.
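The post-processing step could look something like the sketch below. It assumes the segments file has four tab-separated columns (chromosome, start, end, state name such as E1) and writes standard BED9 lines so that a genome browser can pick up the color from the itemRgb field. The labels, colors, and the `relabel_segments` helper are hypothetical; substitute whatever you decide each state means.

```python
# Hypothetical state labels and RGB colors; choose your own after
# inspecting the emissions and enrichment heatmaps.
STATE_INFO = {
    "E1": ("ActivePromoter", "255,0,0"),
    "E2": ("Repressed", "128,128,128"),
}

def relabel_segments(lines, state_info):
    """Turn 4-column segment lines into labeled, colored BED9 lines."""
    out = []
    for line in lines:
        line = line.rstrip("\n")
        # Skip empty lines and any browser/track header lines.
        if not line or line.startswith(("track", "browser", "#")):
            continue
        chrom, start, end, state = line.split("\t")[:4]
        # States without an entry keep their original name and are drawn black.
        label, rgb = state_info.get(state, (state, "0,0,0"))
        # BED9: chrom, start, end, name, score, strand, thickStart, thickEnd, itemRgb
        out.append("\t".join([chrom, start, end, label, "0", ".", start, end, rgb]))
    return out

segments = ["chr1\t0\t1000\tE2", "chr1\t1000\t1400\tE1", "chr1\t1400\t2000\tE3"]
labeled = relabel_segments(segments, STATE_INFO)
print("\n".join(labeled))
```

To have a browser honor the colors, the output usually needs a track line with itemRgb="On" at the top of the file.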


Appreciate the information!

Do the emission values go directly on the png, or do they first have to be modified in some way? I have some values of 0.02 as an example, but a 6 in others. Would the 0.02 be treated as a 0?

Reply written 3.5 years ago by Sinji

Not sure if they're normalized in some way. To figure that out, you need to read the source code or try to reproduce the png given the txt file (and see what, if any, normalization needs to happen). Given the lack of a colormap though, my guess would be that each emissions.txt file is divided by the max of that file.

Reply written 3.5 years ago by Ryan Dale

How do I figure out the label of each state? I got the ChromHMM output, but I can't find the annotation information for each state. Thanks.

Reply written 2.4 years ago by weixiaoyu

The label of each state is subjective. Coming up with good labels requires looking carefully at the enrichments (from running OverlapEnrichment) and emissions heatmaps to decide what you want to name them.

Reply written 2.4 years ago by Ryan Dale

Roman Hillje (Milan, Italy) wrote, 2.2 years ago:

I'm currently studying how ChromHMM produces its output, with a particular interest in the relationship between the enrichment values in the _overlap.txt file and the colors in the heatmap (since I need to reproduce them). I went through the ChromHMM code and found that the heatmap is produced using the JHeatChart library.

The library itself does not perform any scaling or normalization of the input values; ChromHMM does this before passing them on. Unless specified otherwise, each column gets its own color scale: ChromHMM subtracts the column minimum from each value and then divides by the column maximum. The alternative is a single scale based on the values across all columns (activated with -uniformscale in the OverlapEnrichment command).
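For anyone who wants to reproduce the colors, here is a minimal sketch of the scaling exactly as I described it: subtract the minimum, then divide by the maximum, either per column or across the whole table. The `scale_columns` function and the example numbers are mine, not ChromHMM's, and the divisor in particular (maximum versus maximum minus minimum) is worth checking against the source code before relying on it.

```python
def scale_columns(rows, uniform=False):
    """Scale each value as (v - min) / max.

    With uniform=False, min and max are taken per column; with
    uniform=True, they are taken over the whole table (the behavior
    described for -uniformscale). This follows the description in the
    post and is NOT copied from the ChromHMM source.
    """
    ncols = len(rows[0])
    if uniform:
        flat = [v for row in rows for v in row]
        mins = [min(flat)] * ncols
        maxs = [max(flat)] * ncols
    else:
        cols = list(zip(*rows))
        mins = [min(c) for c in cols]
        maxs = [max(c) for c in cols]
    return [
        # Guard against an all-zero column to avoid dividing by zero.
        [(v - lo) / hi if hi else 0.0 for v, lo, hi in zip(row, mins, maxs)]
        for row in rows
    ]

# Made-up enrichment values: three states (rows) by two annotations (columns).
enrichment = [[1.0, 10.0], [2.0, 40.0], [4.0, 20.0]]
per_column = scale_columns(enrichment)
uniform = scale_columns(enrichment, uniform=True)
print(per_column)
print(uniform)
```

Note that with this formula the largest value in a column does not reach 1.0 unless the column minimum is 0, which is one quick way to test the sketch against an actual heatmap.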

I hope this helps to understand the connection between values and heatmap colors. Yet, I'm still not sure how to interpret the enrichment values in the _overlap.txt file.
