Question: How to write a loop for a big matrix?
Asked 5.8 years ago by Jaan (Finland):

Hi everyone,

I have a matrix with 21492 columns and 8532 rows. I am running the flexmix package to calculate Gaussian univariate distributions. When I run the following commands the process crashes due to lack of memory, but if I run only the first 5000 columns [X <- as.matrix(df[,1:5000])], I get the results.

How can I write a loop for the "mult.models.fit1" object so that it goes through all 21492 columns by reading subsections of the matrix's columns (say, 3000 columns at a time up to the last one), without affecting the overall result of the "stepFlexmix" function itself?

>set.seed(1234)
>library(flexmix)
>load(".../MACompendium.RData")
>df <- as.data.frame(eset)
>X <- as.matrix(df)
>dim(X)
[1] 21492  8532
>mult.models.fit1 <- stepFlexmix(X ~ 1, k = 1:6, model = FLXMCmvnorm(), nrep = 15, control = list(minprior = 0))
1 ***************
2 ************* Error: cannot allocate vector of size 1.7 Gb
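For what it's worth, a literal column-block loop would look roughly like the sketch below (3000-column blocks, as described above; the names chunk_size, starts, cols and fits are illustrative, not from the original post). As the replies point out, though, each call fits a separate model on its own block, so this does not reproduce a single fit on all 21492 columns:

chunk_size <- 3000
starts <- seq(1, ncol(X), by = chunk_size)
fits <- lapply(starts, function(s) {
  cols <- s:min(s + chunk_size - 1, ncol(X))
  # each block gets its own, independent mixture fit
  stepFlexmix(X[, cols] ~ 1, k = 1:6, model = FLXMCmvnorm(),
              nrep = 15, control = list(minprior = 0))
})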

Do you have to keep "eset" (whatever that is), and the data frame and the matrix versions of it, all in memory at once?

Reply by David W, 5.8 years ago

Thanks for the interest.

No, I don't need to keep eset (an ExpressionSet from microarray data) or df in memory once I have converted it to the matrix (X) version.

Reply by Jaan, 5.8 years ago

Well, rm(eset, df) will free up some room. From those dimensions, each object will be > 1 Gb in memory.
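A minimal sketch of how that could look in the original session (the gc() and object.size() calls are added here only to illustrate the check; they are not from the post):

df <- as.data.frame(eset)
X  <- as.matrix(df)
rm(eset, df)   # X is the only object stepFlexmix needs
gc()           # return the freed memory to R / the OS
print(object.size(X), units = "Gb")   # roughly 1.4 Gb for a 21492 x 8532 double matrix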

More generally, I doubt it's possible to split the calculation into multiple chunks and stitch the results back together; the EM algorithm works on the entire data set.

Reply by David W, 5.8 years ago

Thanks for the heads-up. Yes, I had the same doubt, and I was looking for a way to work around the EM algorithm's need to see the whole data set.

By the way, I am running this data set on our server (100 GB of memory). I did not use rm(eset, df) because I thought I had enough space without it, but there is no harm in removing them.

Reply by Jaan, 5.8 years ago

Your error message says you are running out of memory (http://stat.ethz.ch/R-manual/R-devel/library/base/html/Memory-limits.html). There are some packages for using "disk as memory", but I don't know if they'll work with flexmix.
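For example, a file-backed copy could be created with the bigmemory package (an illustration only; the file names are made up, and flexmix most likely still needs an ordinary in-memory matrix):

library(bigmemory)
# store the data on disk; only a small pointer object stays in RAM
Xbm <- as.big.matrix(X, backingfile = "X.bin",
                     descriptorfile = "X.desc", backingpath = ".")
# stepFlexmix() expects a regular matrix, so Xbm[, ] would pull everything
# back into RAM anyway, which is why this may not help here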

Reply by David W, 5.8 years ago