This is a cross-post from the Satija lab's GitHub forum; I thought I might get more eyes here, so I'm posting it on this forum as well:
Session info:
R version 3.6.1 (2019-07-05)
Platform: x86_64-apple-darwin15.6.0 (64-bit)
Running under: macOS Sierra 10.12.6
I'm running a MacBook with 8 GB of RAM. I am following this vignette for CITE-seq, but I am only loading and working with the ADT data, starting from the "Cluster directly on protein levels" section.
Everything is fine until I get to the following command:

adt.dist <- dist(t(adt.data))

which returns the following:
Error: vector memory exhausted (limit reached?)
I have tried setting my R_MAX_VSIZE environment variable to values anywhere from 8Gb to 700Gb, as suggested on Stack Overflow for this error, and I confirm the value has taken effect when I load R by checking Sys.getenv("R_MAX_VSIZE").
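For reference, this is roughly how I set it (700Gb being just the largest value I tried). In ~/.Renviron:

R_MAX_VSIZE=700Gb

and then, in a fresh R session:

Sys.getenv("R_MAX_VSIZE")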
To maximize my chances of success, just before executing the troublesome line I cleared all unused objects from the workspace and ran garbage collection.
After the cleanup, mem_used() reports 387 MB. adt.data is the only variable left in the workspace before running dist(), and object.size(adt.data) returns 212 MB.
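Concretely, the pre-flight sequence looks roughly like this (mem_used() here comes from the pryr package; exactly which leftover objects get removed varies by run):

library(pryr)

# drop everything except the ADT matrix, then force garbage collection
rm(list = setdiff(ls(), "adt.data"))
gc()

# sanity-check the memory footprint before calling dist()
mem_used()            # ~387 MB in total
object.size(adt.data) # ~212 MB for the matrix itself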
I can't think of anything else to try. It doesn't feel like my machine should be incapable of running this; the data doesn't seem that big. Is there another solution to this problem? Please let me know if you'd like any additional information. Thanks so much!
Edit: I just tried running it on a friend's machine and got the same error, except this time it said:
Error: cannot allocate vector of size 2025.0 Gb
Well, I guess that's the problem... I don't have 2 TB. In hindsight the size makes sense: dist() stores all n(n-1)/2 pairwise distances as doubles, so 2025 Gb works out to about 2.7 × 10^11 distances, i.e. roughly 700,000 cells, no matter how small the per-cell data is. Is there a way to shrink this, or to run an alternate type of PCA, in order to do the clustering with the ADT data alone?
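To make the question concrete, here is the kind of alternative I have in mind (a sketch only, using the cbmc object from the vignette and an arbitrary choice of 10 PCs; I'm not sure whether this is a valid substitute for clustering on the full distance matrix, which is really what I'm asking):

DefaultAssay(cbmc) <- "ADT"
cbmc <- ScaleData(cbmc)

# PCA on the small protein panel instead of a dense cell-by-cell distance matrix
cbmc <- RunPCA(cbmc, features = rownames(cbmc), npcs = 10, reduction.name = "pca_adt")

# graph-based clustering in PC space, which avoids allocating the n x n dist() result
cbmc <- FindNeighbors(cbmc, reduction = "pca_adt", dims = 1:10)
cbmc <- FindClusters(cbmc, resolution = 0.2)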
Edit 2: I just tried changing R_MAX_VSIZE to 2200Gb and rerunning. R accepted the value and I let it cook for a while; when I came back an hour later I got the following message:
R Session Aborted. R encountered a fatal error. The session was terminated.