Question: Deeptools plotHeatmap: too much black! missing data assigned incorrectly
David K (Colorado State University) wrote, 2.3 years ago:

Hello, I'm using deeptools computeMatrix/plotHeatmap, and am getting a large amount of missing data (i.e. black in the graph).

Commands used

The call to computeMatrix takes in ChIPseq peaks, and corresponding signal in bigWig format:

Command 1: computeMatrix

computeMatrix reference-point --referencePoint center -S -R peaks.bed -a 500 -b 500 -o matrix.gz

Note: actual filenames and the -p option are omitted.

Note: --binSize is either omitted or set to 1.

This is followed by:

Command 2: plotHeatmap

plotHeatmap -m $infile -o $ofile --outFileSortedRegions ${ofile/.$outformat/.bed} --sortUsing mean --outFileNameMatrix ${ofile/.$outformat/.mat} --zMin 0

Where: infile=matrix.gz (output from first command); ofile=matrix.$outformat; outformat typically equals "png"


I have verified that the data are indeed not missing, in at least the first few examples. See the highlighted area of the attached graph (red box).

[Image: plotHeatmap screenshot]

Here is a chunk of the corresponding data stored via the --outFileNameMatrix option:

1.256 1.256 1.256 1.256 1.256 1.256 1.256 1.256 1.256 1.256 1.346 1.346 1.346

I've validated these numbers by looking specifically at this region of the input file (equivalent to in Command 1)

#converted to wiggle
track type=wiggle_0
fixedStep chrom=chrII start=2378124 span=1 step=1

Note: this matches the above matrix output when --binSize is set to 1 for Command 1: computeMatrix; the default setting (10) gives the expected results.

This value range should be printed as yellow according to the figure legend (not in screenshot), and represents the middle range of the data being plotted.

Also, after adding the computeMatrix parameter --missingDataAsZero, I would expect all of that black to become red, right? That is not reflected in this plot:

[Image: plotHeatmap with black missing data versus zeroes]


Does anyone know what might be going on here? Am I running the commands incorrectly (newb)?


Tagging: Devon Ryan

written 2.3 years ago by genomax

Annoyingly, I don't actually get notified by these :(

written 2.3 years ago by Devon Ryan

Devon Ryan (Freiburg, Germany) wrote, 2.3 years ago:

Every time this issue is brought up it turns out to be due to the interpolation done in order to make a large matrix fit into a much smaller number of pixels. Try doing one of the following:

  1. Make the image larger so there's less interpolation.
  2. Increase the DPI.
  3. Change the --interpolationMethod option.

I have yet to see an example where that doesn't produce the results you're expecting. At the end of the day this comes down to how best to compress a lot of information into a small image. At some point you have to throw information away since there aren't enough pixels to represent everything, so no particular method will always produce the most desirable results.
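The three suggestions above can be sketched as plotHeatmap calls. The flags --heatmapHeight, --heatmapWidth, --dpi and --interpolationMethod are plotHeatmap options; the numeric values and output filenames are illustrative assumptions, and `run` is a hypothetical helper that only prints each command unless DRY_RUN=0 is set.

```shell
#!/usr/bin/env bash
# Sketch only: output names and numeric values are illustrative assumptions.
infile=matrix.gz   # matrix written by computeMatrix (Command 1)

# Hypothetical helper: print the command unless DRY_RUN=0, so the
# script can be inspected without deepTools installed.
run() {
    if [ "${DRY_RUN:-1}" = "0" ]; then "$@"; else printf '%s\n' "$*"; fi
}

# 1. Larger image: more pixels per region, so less interpolation.
run plotHeatmap -m "$infile" -o larger.png \
    --heatmapHeight 50 --heatmapWidth 10 --zMin 0

# 2. Higher DPI: same figure size, more pixels.
run plotHeatmap -m "$infile" -o dpi600.png --dpi 600 --zMin 0

# 3. Different interpolation: "nearest" picks one value per pixel
#    instead of blending neighbouring bins.
run plotHeatmap -m "$infile" -o nearest.png \
    --interpolationMethod nearest --zMin 0
```

Run it as-is to review the commands, then with DRY_RUN=0 to actually produce the three images and compare them side by side.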


Thanks! Changing the interpolation to "bicubic" made it worse, but "nearest" restored the data. I used "--sortRegions keep" to compare missing values as zeroes versus NaNs. How does sorting take the missing data into account?

Thanks for the solution!
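The comparison described in this comment might look roughly like the sketch below. It is an assumption-laden reconstruction, not the exact commands that were run: signal.bw is a hypothetical placeholder (the original post omits the real bigWig filename), the output names are invented, and `run` is a hypothetical helper that only prints each command unless DRY_RUN=0 is set.

```shell
#!/usr/bin/env bash
# Sketch only. signal.bw is a hypothetical placeholder: the original
# post omits the real bigWig filename.
bigwig=signal.bw

# Hypothetical helper: print the command unless DRY_RUN=0.
run() {
    if [ "${DRY_RUN:-1}" = "0" ]; then "$@"; else printf '%s\n' "$*"; fi
}

# Matrix with missing bins left as NaN (the default behaviour).
run computeMatrix reference-point --referencePoint center \
    -S "$bigwig" -R peaks.bed -a 500 -b 500 -o matrix_nan.gz

# Same matrix, but with missing bins replaced by zero.
run computeMatrix reference-point --referencePoint center \
    -S "$bigwig" -R peaks.bed -a 500 -b 500 --missingDataAsZero \
    -o matrix_zero.gz

# Plot both with the input row order kept and without pixel blending,
# so the two images can be compared region by region.
for m in matrix_nan matrix_zero; do
    run plotHeatmap -m "$m.gz" -o "$m.png" \
        --sortRegions keep --interpolationMethod nearest --zMin 0
done
```

Keeping the row order fixed means any difference between the two images comes from how missing bins are encoded, not from re-sorting.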

written 2.3 years ago by David K

Missing data is ignored during all sorting steps, since otherwise you end up with a block of nicely sorted regions at the top and another block of unsorted regions containing missing data at the bottom. The heatmap and the output file will have the same sorting order in all cases.
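As a toy illustration of what "ignored during sorting" means, the sketch below ranks rows by their mean while skipping "nan" entries. It is not deepTools code; the `row_means` function, the sample rows, and the choice to score all-missing rows as 0 are assumptions made for the example.

```shell
#!/usr/bin/env bash
# Toy model only, not deepTools code: rank rows by their mean,
# skipping "nan" entries instead of letting them spoil the mean.
row_means() {
    awk '{
        sum = 0; n = 0
        for (i = 1; i <= NF; i++)
            if ($i != "nan") { sum += $i; n++ }
        # Rows with no data at all get a mean of 0 here (an assumption).
        printf "%s\t%.3f\n", NR, (n ? sum / n : 0)
    }'
}

# Three hypothetical regions; region 2 has missing bins.
# Prints "row number <TAB> mean", highest mean first.
printf '%s\n' '1 1 1 1' 'nan 4 nan 4' '2 2 2 2' |
    row_means | LC_ALL=C sort -k2,2nr
```

Region 2 is ranked by the mean of its two observed bins, so it sorts alongside the complete rows rather than being pushed into a separate unsorted block.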

written 2.3 years ago by Devon Ryan