Question: Recover experimental data from a heat map?
 
 
 

Ideally, the data behind heat maps are simply available from the article (e.g. as supplementary information), or stored in data repositories, either institutional ones or general services like Dryad or FigShare. Of course, I can still email the corresponding author (but he is in fact travelling until July 11th).

However, all too often this does not work out in practice, or not fast enough. As such, text-mining efforts have found enormous use in bioinformatics, including the analysis of graphics, like scatter plots (see below), and OCR-ing of chemical structures.

Therefore, I was wondering if there is an R package (or something similar) available to recover experimental data from a heat map like the one shown here? Like the digitize package does for scatter plots... I am more than willing to overlook the limitations, like the lack of resolution at the end points, or non-linear transformations applied when choosing the colors; these I can easily correct for in the data analysis.
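To make concrete what such a recovery would involve, here is a minimal sketch in Python (not an existing package, and not R). It assumes the colours of each heat map cell and of the legend's colour ramp have already been read from the image as RGB tuples, and that the values at both ends of the legend are known. The names `ramp_position` and `recover_values` are made up for this illustration, and the mapping from ramp position to value is assumed to be linear.

```python
from typing import List, Sequence, Tuple

RGB = Tuple[int, int, int]


def ramp_position(color: RGB, ramp: Sequence[RGB]) -> float:
    """Return the position (0.0 to 1.0) of the ramp colour nearest to `color`."""
    def dist2(a: RGB, b: RGB) -> int:
        # Squared Euclidean distance in RGB space.
        return sum((x - y) ** 2 for x, y in zip(a, b))

    best = min(range(len(ramp)), key=lambda i: dist2(color, ramp[i]))
    return best / (len(ramp) - 1)


def recover_values(cells: Sequence[Sequence[RGB]], ramp: Sequence[RGB],
                   vmin: float, vmax: float) -> List[List[float]]:
    """Map each cell colour to a value, ASSUMING a linear colour scale."""
    return [[vmin + ramp_position(c, ramp) * (vmax - vmin) for c in row]
            for row in cells]


# Hypothetical green-black-red ramp sampled from a legend running from -2 to 2.
ramp = [(0, 255, 0), (0, 128, 0), (0, 0, 0), (128, 0, 0), (255, 0, 0)]
cells = [[(0, 250, 5), (10, 0, 0)],
         [(250, 3, 0), (120, 0, 0)]]
print(recover_values(cells, ramp, -2.0, 2.0))
```

The resolution of the result is limited by how many distinct colours can be sampled from the legend, which is the kind of limitation mentioned above.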

[Image: example heat map]

 
 
 

Why don't you ask the author for the data?

written 10 months ago by Lyco
 

What makes you think I did not?

written 10 months ago by Egon Willighagen

1 answer

 
 
 

I'm pretty sure that this is going to be an exercise in futility. Even if you can extract intensity values, there are a bunch of unknowns that could trip you up. For one, you don't know how the data is scaled. Does every fold-increase in intensity correspond to a fold-increase in value? A two-fold change? Is it a linear correlation or a log-scaled one? You don't even know if the data was transformed in some way (logarithmically?) prior to plotting to make it look nicer.
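The scaling ambiguity can be illustrated with a small Python sketch. It assumes a colour has already been matched to a relative position on the colour bar (0.0 at one end, 1.0 at the other) and that the end-point values are known; the function names are hypothetical. The same position yields quite different values depending on whether the bar is read as linear or as log-scaled.

```python
import math


def value_linear(pos: float, vmin: float, vmax: float) -> float:
    """Interpret the colour bar as evenly spaced in the value itself."""
    return vmin + pos * (vmax - vmin)


def value_log(pos: float, vmin: float, vmax: float) -> float:
    """Interpret the colour bar as evenly spaced in log10(value).

    Requires vmin > 0 and vmax > 0.
    """
    lo, hi = math.log10(vmin), math.log10(vmax)
    return 10 ** (lo + pos * (hi - lo))


# A colour halfway along a bar labelled 1 to 100:
print(value_linear(0.5, 1.0, 100.0))  # midpoint under the linear reading
print(value_log(0.5, 1.0, 100.0))     # midpoint under the log reading
```

Without knowing which reading the authors used, there is no way to choose between the two from the image alone.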

My suggestion would be to write the authors first, as most journals should require that data related to the main figures be released. If they're not cooperative, go find yourself another data set.

 
 
 

In addition, a heat map is an approximation of the original data, scaled according to the colour ramp that was used. If data are not available in usable formats, I think we should focus our efforts on shaming those involved and advocating change :-)

written 10 months ago by Neilfws ♦♦
 

Chris, mmm... sure, sure. Why did I not think of that? Oh wait, I did. I post answers like this too, and I appreciate your comments nevertheless, but it should be posted as a comment, not an answer, I guess.

The whole idea of text-mining has this issue to a greater or lesser extent, and, yes, Open Data would be ideal. Unfortunately, not everyone, or every journal, seems to agree with that. They're fair points, a little obvious, and not answering the question.

written 10 months ago by Egon Willighagen
 

Guys, thanks for your comments. I do apologize for being lazy: I skipped the side issues and just went for the question... I appreciate your (obvious) concerns, though, but perhaps more as a comment than as an answer, which it isn't...

written 10 months ago by Egon Willighagen
 