Question: Parallel computing while using R packages
pixie@bioinfo (1.4k) wrote 10 months ago:

A number of bioinformatics R packages allow us to specify the number of threads (k) to use on the server. I am using the wTO package (https://cran.r-project.org/web/packages/wTO/index.html), which constructs correlation networks and relies on library(parallel) to run on multiple threads. I am running the following function:

x <- wTO.Complete(k = 32, n = 100, Data, Overlap, method = "p", method_resampling = "Bootstrap", pvalmethod = "BH", savecor = FALSE, expected.diff = 0.2, lag = NULL, normalize = FALSE, plot = TRUE)

Here, k = 32 on a CentOS server. While running the function, I get the error:

Error in unserialize(node$con) : error reading from connection

The developer of the package told me it is due to a connection with the cluster that was left open and not properly closed, and asked me to run the following piece of code:

require(parallel)
cl <- makeCluster(32)
stopCluster(cl)

I ran the above and then re-ran the network construction command, but I still got the error. I would be very grateful if anyone could help me with this.
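The developer's advice concerns a cluster whose connections were left open. As a minimal sketch of how that is usually avoided (this is not wTO's internal code; `square` and `run_on_cluster` are hypothetical names used only for illustration), the cluster can be created inside a function so that `on.exit()` calls `stopCluster()` even when a worker task fails:

```r
library(parallel)

# Toy task standing in for real work (hypothetical).
square <- function(x) x^2

# Create a cluster, run the task, and always shut the cluster down.
run_on_cluster <- function(n_workers, xs) {
  cl <- makeCluster(n_workers)
  # Registered immediately, so the workers are stopped on success or error.
  on.exit(stopCluster(cl), add = TRUE)
  parLapply(cl, xs, square)
}

res <- run_on_cluster(2, 1:4)
```

This only helps for clusters created in your own code; a cluster created inside a package function has to be closed by that function.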

> sessionInfo()
R version 3.4.1 (2017-06-30)
Platform: x86_64-redhat-linux-gnu (64-bit)
Running under: CentOS release 6.8 (Final)

Matrix products: default
BLAS: /usr/lib64/R/lib/libRblas.so
LAPACK: /usr/lib64/R/lib/libRlapack.so

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] parallel  stats     graphics  grDevices utils     datasets  methods
[8] base

other attached packages:
[1] wTO_1.6

loaded via a namespace (and not attached):
 [1] Rcpp_0.12.12      visNetwork_2.0.3  digest_0.6.12     plyr_1.8.4
 [5] jsonlite_1.5      magrittr_1.5      stringi_1.1.5     som_0.3-5.1
 [9] reshape2_1.4.2    data.table_1.10.4 tools_3.4.1       stringr_1.2.0
[13] htmlwidgets_0.9   igraph_1.2.1      compiler_3.4.1    pkgconfig_2.0.1
[17] htmltools_0.3.6
modified 10 months ago by Devon Ryan (89k) • written 10 months ago by pixie@bioinfo (1.4k)

You tend to get that when you run out of memory.
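A rough way to check this is to estimate the footprint of one worker and multiply it by k. The sketch below assumes each worker process holds its own copy of the data plus one dense gene-by-gene correlation matrix; that is an assumption about the workload, not a documented property of wTO. The matrix is a small hypothetical stand-in for the real `Data`:

```r
# Hypothetical stand-in for the real expression matrix (genes x samples).
set.seed(1)
Data <- matrix(rnorm(2000 * 20), nrow = 2000)

k <- 32  # number of workers requested

# Size of one copy of the input, in bytes.
data_bytes <- as.numeric(object.size(Data))

# Assumed extra cost per worker: one dense genes x genes matrix of doubles.
cor_bytes <- 8 * nrow(Data)^2

# Rough total across k worker processes, in gigabytes.
est_gb <- k * (data_bytes + cor_bytes) / 1024^3
est_gb
```

Comparing `est_gb` with the memory actually free on the server (for example, from `free -g`) gives an indication of whether a smaller k is needed.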

written 10 months ago by Devon Ryan (89k)

Hi Devon, I successfully constructed a gene correlation network of comparable size (actually slightly bigger) using the same package on the same server. However, I have been getting this error since then.

modified 10 months ago • written 10 months ago by pixie@bioinfo (1.4k)

Use a number smaller than 32 in cl <- makeCluster(32).

Here is how I select the number of cores:

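require(doParallel)
# Keep the core count and the cluster object in separate variables:
# MC_CORES and mc.cores expect a number, not a cluster.
cores <- detectCores()
cl <- makeCluster(cores, type='PSOCK')
registerDoParallel(cl)
Sys.setenv("MC_CORES"=cores)

require(parallel)
options("mc.cores"=cores)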

That will configure the cores for the majority (if not all) of the R functions.

written 10 months ago by Kevin Blighe (41k)

One should note that detectCores() may produce an excessively large number of cores (e.g., for me it produces between 64 and 144, depending on the server/node), which may then not have the desired effect :)
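One way to guard against this, sketched here with a hypothetical cap (`max_workers` is an illustrative name and 8 is an arbitrary value), is to bound whatever `detectCores()` reports and to handle the case where it returns `NA`:

```r
library(parallel)

# Hypothetical cap; choose a value suited to the server's memory and other users.
max_workers <- 8L

detected <- detectCores()  # may return NA on some platforms
if (is.na(detected)) detected <- 1L

# Leave one core free, never exceed the cap, and never go below one worker.
cores <- max(1L, min(detected - 1L, max_workers))
cores
```

The resulting `cores` can then be used wherever a worker count is expected, for example as the argument to `makeCluster()`.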

written 10 months ago by Devon Ryan (89k)

Yes, good point. One can always just set the cores manually with, for example, cores <- 4

modified 10 months ago • written 10 months ago by Kevin Blighe (41k)

Thank you, I will try this out. I am really stuck with this issue.

written 10 months ago by pixie@bioinfo (1.4k)

It appears that the program runs only if I set k=1, in which case it takes almost 10-11 hours to construct one network. For every other value, it throws the above error.
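To separate a problem in the parallel setup from a problem inside the package, one option is to run a trivial job on a plain cluster with increasing worker counts. The sketch below uses only the parallel package; `double_it` and `cluster_works` are hypothetical helper names, and a passing check does not rule out memory limits under the real workload:

```r
library(parallel)

# Trivial task (hypothetical) so that any failure points at the cluster itself.
double_it <- function(i) i * 2

# Return TRUE if a cluster of n_workers can be started and completes the task.
cluster_works <- function(n_workers) {
  cl <- tryCatch(makeCluster(n_workers), error = function(e) NULL)
  if (is.null(cl)) return(FALSE)
  on.exit(stopCluster(cl), add = TRUE)
  out <- tryCatch(parSapply(cl, seq_len(n_workers), double_it),
                  error = function(e) NULL)
  identical(out, seq_len(n_workers) * 2)
}

# Check a few small worker counts; larger values can be added as needed.
checks <- sapply(c(1, 2), cluster_works)
checks
```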

written 10 months ago by pixie@bioinfo (1.4k)

That's weird. Which version of the parallel package are you using? Do the authors advertise that the function is parallel-processing enabled?

I had to edit the clusGap function in order to make it parallelised. Took a day to figure it out: https://github.com/kevinblighe/clusGapKB

The authors never advertised that it was parallel-processing enabled though.

modified 10 months ago • written 10 months ago by Kevin Blighe (41k)