Question: How to apply ChromHMM on ChromImpute output?
2.0 years ago by
Bioinformatist Newbie wrote:

I want to make a genome segmentation for a cell type of interest using 6 different histone marks. Unfortunately, only 5 histone marks are available for that cell type, so I decided to first use ChromImpute to impute the signal for the missing histone mark. After following the ChromImpute manual, I obtained the imputed signal for the histone mark of interest (one .wig file for each chromosome).

The next step is to binarize these signal values. According to the ChromHMM reference manual, this can be done with BinarizeSignal. To first test whether everything works, I prepared the input files for 2 chromosomes according to the manual (the input files look like this):

HEK293 Chr10

HEK293 Chr11

I used the following command to binarize the signal files (the input file names contain _signal_ as well; to be precise, they are chr10_signal_HEK293_H3K27me3.wig and chr11_signal_HEK293_H3K27me3.wig):

java -jar ChromHMM.jar BinarizeSignal ./inputdir ./outputdir

I got the following output:

Writing to file ./outputdir//HEK293_chr10_binary.txt
Writing to file ./outputdir//HEK293_chr11_binary.txt

But when I checked the binarized files, they contain only 0 values; there is not a single 1 in either file. My input signal files, which are binned into 25 bp windows, have many bins with non-zero values (e.g. 0.2, 1.3, 0.2). So I think I'm making a mistake in how I use BinarizeSignal.
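To confirm that a binarized file really has no 1 calls for any mark, a quick tally per column helps. This is only a minimal sketch: `count_ones` is a hypothetical helper, and it assumes the binarized layout is a cell/chromosome line, then a line of mark names, then one 0/1 value per mark on each following line.

```python
def count_ones(binary_text):
    """Count 1-calls per mark in binarized text.

    Assumed layout: line 1 is 'cell<TAB>chrom', line 2 holds the mark
    names, and every later line holds one 0/1 value per mark.
    """
    lines = [ln for ln in binary_text.splitlines() if ln.strip()]
    marks = lines[1].split()
    counts = {mark: 0 for mark in marks}
    for row in lines[2:]:
        for mark, value in zip(marks, row.split()):
            if value == "1":
                counts[mark] += 1
    return counts


# Tiny made-up example (not real data): one mark, three bins, one call.
example = "HEK293\tchr10\nH3K27me3\n0\n0\n1\n"
print(count_ones(example))
```

For a real file, the text could be read with `open(path).read()` and passed in the same way.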

Can anybody guide me how to solve this issue? Thank you.

written 2.0 years ago by Bioinformatist Newbie • modified 16 months ago by isalinas221

I've tried using a control signal as well, but the output file still assigns 0 to every bin. Maybe it has something to do with the way the files are converted into signal values, as mentioned in this note in the ChromHMM manual: "Note the binarization from signal is designed only for signal data which represent counts of reads assigned to bins such as the optional output from the BinarizeBed command. If the signal was computed in other ways, then the binarization based on the poisson distribution may not give meaningful results." I have also rounded the input signal values to integers so that they look exactly like the sample input format, but the problem is still the same.

written 2.0 years ago by Bioinformatist Newbie • modified 2.0 years ago

To check on possible options, I used the following dummy data:

Cell    chr1
Mark1   Mark2   Mark3
0       4       0
1       3       0
2       1       9
0       0       0
0       0       5
0       8       0
9       0       0

and the binarization I obtained was:

Cell    chr1
Mark1   Mark2   Mark3
0       0       0
0       0       0
0       0       1
0       0       0
0       0       0
0       0       0
1       0       0

This means that, in this test, a bin needed a signal value of at least 9 to be binarized to 1 (a value of 8 still gave 0), and in one of my actual input files no bin has a signal greater than 4. Maybe that's why the binarized files contain only 0 values. I don't know whether that makes sense or not.
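The pattern in the dummy table above can be reproduced with a small Poisson-tail calculation. This is only a sketch under stated assumptions, not ChromHMM's actual code: it assumes each mark is modelled as Poisson with mean equal to that mark's average signal, that the tail probability is 0.0001, and that the cutoff is the largest count n whose upper tail P(X >= n) is still at least that probability. The function names are made up for illustration.

```python
import math


def poisson_tail(mean, n):
    """P(X >= n) for X ~ Poisson(mean)."""
    below = sum(math.exp(-mean) * mean**k / math.factorial(k) for k in range(n))
    return 1.0 - below


def poisson_cutoff(mean, tail_prob=1e-4):
    """Largest count n (at least 1) with P(X >= n) >= tail_prob (assumed rule)."""
    n = 1
    while poisson_tail(mean, n + 1) >= tail_prob:
        n += 1
    return n


def binarize(column, tail_prob=1e-4):
    """Call a bin 1 when its signal reaches the cutoff for that mark."""
    cutoff = poisson_cutoff(sum(column) / len(column), tail_prob)
    return [1 if value >= cutoff else 0 for value in column]


# The dummy data from this thread, one list per mark (column).
marks = {
    "Mark1": [0, 1, 2, 0, 0, 0, 9],
    "Mark2": [4, 3, 1, 0, 0, 8, 0],
    "Mark3": [0, 0, 9, 0, 5, 0, 0],
}
for name, column in marks.items():
    print(name, poisson_cutoff(sum(column) / len(column)), binarize(column))
```

Under these assumptions the cutoffs come out as 8, 10 and 9 for Mark1, Mark2 and Mark3, which matches the binarized table above. The point is that the cutoff is not a fixed 9: it depends on each mark's mean signal, so signal that is not on a read-count scale can easily end up below the cutoff in every bin.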

written 2.0 years ago by Bioinformatist Newbie

As you mention, ChromHMM expects read counts. Even if you transform your data somehow to make it work with ChromHMM, how will you interpret the results? I would expect the HMM to see the imputed data as a combination of the other vars anyway, making it redundant. I guess I'm not sure there's an advantage to using the imputed data.

written 24 months ago by Ryan Dale • modified 24 months ago
16 months ago by
isalinas221 wrote:

Hi, did you solve the binarization problem?

First, ChromImpute outputs signal at 25 bp resolution, while ChromHMM works with 200 bp bins by default, so you have to change the original resolution by summing or averaging bins in the wig files. Once that is done, you can call peaks on the signal files with MACS2 and provide the necessary information to the 'BinarizeBed -peaks' command.
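The resolution change can be sketched like this: 8 consecutive 25 bp bins make one 200 bp bin. This assumes the 25 bp values for one chromosome have already been read from the wig file into a plain list; `rebin` is a hypothetical helper, not part of ChromImpute or ChromHMM.

```python
def rebin(values, factor=8, how="mean"):
    """Collapse consecutive fine bins into coarser ones.

    With 25 bp input bins, factor=8 gives 200 bp bins. A trailing
    partial chunk is averaged (or summed) over the bins it contains.
    """
    if how not in ("mean", "sum"):
        raise ValueError("how must be 'mean' or 'sum'")
    coarse = []
    for start in range(0, len(values), factor):
        chunk = values[start:start + factor]
        total = sum(chunk)
        coarse.append(total / len(chunk) if how == "mean" else total)
    return coarse


# Sixteen made-up 25 bp bins become two 200 bp bins.
fine = [1] * 8 + [2] * 8
print(rebin(fine))             # averaged per 200 bp bin
print(rebin(fine, how="sum"))  # summed per 200 bp bin
```

Whether to sum or average depends on what the downstream step expects; averaging keeps the values on the original imputed-signal scale, while summing makes them 8 times larger.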

Another way of doing this is to generate the signal matrices expected by the BinarizeSignal command, as described in the ChromHMM manual.
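A rough sketch of building such a matrix, following the layout of the dummy data earlier in this thread: a first line with the cell type and chromosome, a second line with the mark names, then one row of signal values per bin. `format_signal_matrix` is a hypothetical helper, and the exact file naming should be checked against the ChromHMM manual.

```python
def format_signal_matrix(cell, chrom, signals):
    """Build tab-delimited signal-matrix text for one cell type and chromosome.

    `signals` maps mark name -> list of per-bin values; every mark must
    have the same number of bins.
    """
    marks = list(signals)
    n_bins = len(signals[marks[0]])
    if any(len(signals[mark]) != n_bins for mark in marks):
        raise ValueError("all marks need the same number of bins")
    lines = [f"{cell}\t{chrom}", "\t".join(marks)]
    for i in range(n_bins):
        lines.append("\t".join(str(signals[mark][i]) for mark in marks))
    return "\n".join(lines) + "\n"


# First two rows of the dummy data from this thread, two marks only.
text = format_signal_matrix("Cell", "chr1", {"Mark1": [0, 1], "Mark2": [4, 3]})
print(text, end="")
```

The returned text can then be written to a file in the input directory; as noted in the question, the file names there contain _signal.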

written 16 months ago by isalinas221