5 weeks ago
ArtiCore
Hi,
I'm working with a large dataset where I'm trying to generate a clustered heatmap from z-scores. However, I’m hitting a memory error: "Error: cannot allocate vector of size 614.9 Gb." The data size is simply too large to be processed in one go.
Does anyone have advice on the following?
- How can I generate a heatmap for a dataset this large?
- Are there methods to process or visualize the data in chunks while retaining meaningful clusters?
- Are there tools, R packages, or approaches for reducing memory usage for this type of task?
I'd appreciate any insights or suggestions—thank you!
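For a sense of scale, the size in the error message can be turned into matrix dimensions. This is a rough sketch only. It assumes, which the post does not confirm, that the failed allocation is a dense square matrix of 8-byte doubles (for example a full row-by-row distance matrix built for clustering), and that R's "Gb" here means 2**30 bytes. The function names are made up for illustration.

```python
import math

GIB = 2**30            # assumption: "Gb" in the R message means 2**30 bytes
BYTES_PER_DOUBLE = 8   # R numeric values are 8-byte doubles


def dense_matrix_gib(n_rows, n_cols, bytes_per_value=BYTES_PER_DOUBLE):
    """Memory needed for a dense n_rows x n_cols matrix, in GiB."""
    return n_rows * n_cols * bytes_per_value / GIB


def square_side_for(gib, bytes_per_value=BYTES_PER_DOUBLE):
    """Largest n such that an n x n matrix fits in `gib` GiB."""
    return math.isqrt(int(gib * GIB / bytes_per_value))


# Side length implied by the reported 614.9 Gb, under the assumptions above.
n = square_side_for(614.9)
print(n)
print(round(dense_matrix_gib(n, n), 1))
```

Under those assumptions the request corresponds to a few hundred thousand rows being compared against each other, which suggests reducing the number of rows before clustering rather than looking for more RAM.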
What kind of data is this? You should add some information about that to get specific help. Depending on the data type there may be different strategies.
The data are derived from a genomic raw count matrix transformed into z-scores, so it is a plain numeric matrix. In R I am trying to run this command:
See if these help with plotting and with reducing the data size while you wait for an answer:
Heatmaps In R With Huge Data
Improve Large Heatmap Generation in R
How to plot the heatmap of gene expression for very large data set ?
https://stackoverflow.com/questions/8896778/heatmaps-with-huge-data
https://gdevailly.netlify.app/post/plotting-big-matrices-in-r/
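As one generic illustration of reducing data size before plotting, the sketch below averages consecutive groups of rows so that the matrix handed to the heatmap is much smaller. It is not taken from the linked posts. The function name `block_mean_rows` and the toy matrix are hypothetical, and averaging neighbouring rows is only meaningful if the row order carries information (for example rows sorted by genomic position), which is an assumption here.

```python
def block_mean_rows(matrix, block_size):
    """Average consecutive groups of `block_size` rows, column by column.

    The last block may contain fewer rows if the row count is not a
    multiple of `block_size`.
    """
    if block_size < 1:
        raise ValueError("block_size must be at least 1")
    reduced = []
    for start in range(0, len(matrix), block_size):
        block = matrix[start:start + block_size]
        n_cols = len(block[0])
        reduced.append(
            [sum(row[c] for row in block) / len(block) for c in range(n_cols)]
        )
    return reduced


# Toy 5 x 2 matrix, for illustration only.
toy = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [7.0, 8.0], [9.0, 10.0]]
small = block_mean_rows(toy, 2)
print(small)
```

The trade-off is that per-row detail is lost: clusters found on the reduced matrix describe groups of averaged rows, not individual features.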
Not sure if this makes a difference in R, but in Python one could force the data type to be float32 rather than the default float64 type.
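A minimal sketch of what that saving looks like. In practice this is usually done with NumPy (for instance `astype(np.float32)`). To stay dependency-free, the example uses the standard library's `array` module instead, where typecode "d" is a double and "f" is a single-precision float. The z-score values are invented for illustration. Note that base R stores numeric vectors as 8-byte doubles, so the same switch is not available there without extra packages.

```python
from array import array

# Hypothetical z-scores, for illustration only.
z_scores = [0.12, -1.5, 2.25, 0.0, -0.75]

as_float64 = array("d", z_scores)  # double precision, 8 bytes per value
as_float32 = array("f", z_scores)  # single precision, 4 bytes per value

bytes64 = as_float64.itemsize * len(as_float64)
bytes32 = as_float32.itemsize * len(as_float32)
print(bytes64, bytes32)

# Single precision keeps roughly 7 significant decimal digits,
# so some values are rounded slightly when converted.
max_rounding_error = max(abs(a - b) for a, b in zip(as_float64, as_float32))
print(max_rounding_error)

# Halving the reported 614.9 Gb allocation still leaves a very large request.
reported_gib = 614.9
print(reported_gib / 2)
```

Single precision is normally accurate enough for z-scores shown as colours, but a factor of two alone does not bring an allocation of this size within reach of typical hardware.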