mclapply on AWS
1
0
Entering edit mode
2.8 years ago
joe ▴ 240

I'm calling functions with Rscript from an AWS terminal, and I get an error when a function uses mclapply(). The function runs fine in RStudio (AWS AMI) in the same environment; the error occurs only when I run it from the terminal with Rscript myFunction.R --json=config.json.

mclapply(1:dim(this.split)[1], SomeFun, mc.preschedule=TRUE)

mclapply quits with the error

all scheduled cores encountered errors in user code

Does anyone have an idea whether there is some required parameter when calling from the terminal?

Here is my R environment (printed from inside the function)

R version 3.5.1 (2018-07-02)
Platform: x86_64-conda_cos6-linux-gnu (64-bit)
Running under: Amazon Linux 2

Matrix products: default
BLAS/LAPACK: /anaconda3/envs/r35p27/lib/R/lib/libRblas.so

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats4    parallel  stats     graphics  grDevices utils     datasets
[8] methods   base

other attached packages:
 [1] rlist_0.4.6.1               stringi_1.4.3
 [3] stringr_1.3.1               seqinr_3.4-5
 [5] rBLAST_0.99.2               ShortRead_1.40.0
 [7] GenomicAlignments_1.18.1    SummarizedExperiment_1.12.0
 [9] DelayedArray_0.8.0          matrixStats_0.54.0
[11] Biobase_2.42.0              Rsamtools_1.34.0
[13] GenomicRanges_1.34.0        GenomeInfoDb_1.18.1
[15] Biostrings_2.50.2           XVector_0.22.0
[17] IRanges_2.16.0              S4Vectors_0.20.1
[19] BiocParallel_1.16.6         BiocGenerics_0.28.0
[21] aws.s3_0.3.12               jsonlite_1.5
[23] configr_0.3.3               optparse_1.6.2

loaded via a namespace (and not attached):
 [1] Rcpp_1.0.1             RColorBrewer_1.1-2     compiler_3.5.1
 [4] base64enc_0.1-3        bitops_1.0-6           tools_3.5.1
 [7] zlibbioc_1.28.0        digest_0.6.20          lattice_0.20-35
[10] Matrix_1.2-14          yaml_2.2.0             GenomeInfoDbData_1.2.1
[13] hwriter_1.3.2          httr_1.3.1             xml2_1.2.0
[16] ade4_1.7-13            grid_3.5.1             getopt_1.20.2
[19] data.table_1.12.2      glue_1.3.1             R6_2.2.2
[22] latticeExtra_0.6-28    magrittr_1.5           MASS_7.3-51.4
[25] aws.signature_0.5.0    ini_0.3.1              RCurl_1.95-4.12
[28] RcppTOML_0.1.6
R CRAN parallel mclapply AWS • 1.9k views

Check that it is not trying to use more cores than available and try setting the option mc.cores explicitly to the number of cores you want to use or that are actually available.
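
As a rough sketch of that suggestion (assuming a Linux or macOS host, since mclapply relies on forking), the number of workers can be derived from what the machine reports and passed explicitly. The worker function `square` and the input `1:8` are hypothetical stand-ins, not the code from the question:

```r
library(parallel)

# Cores the machine reports; detectCores() can return NA, so fall back to 1.
n_available <- detectCores()
if (is.na(n_available)) n_available <- 1L

# Leave one core free, but never request fewer than one worker.
n_workers <- max(1L, n_available - 1L)

# Hypothetical stand-in for the real worker function.
square <- function(i) i^2

# Pass mc.cores explicitly instead of relying on getOption("mc.cores", 2L).
res <- mclapply(1:8, square, mc.cores = n_workers, mc.preschedule = TRUE)
```

If mc.cores is not given, mclapply uses the mc.cores option and falls back to 2, so setting it explicitly makes the behaviour the same in RStudio and under Rscript.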


Yes, ensure that you have registered the cores properly. I go over this, here: Parallel processing in R

2.8 years ago
joe ▴ 240

Thanks all. I figured out the issue: the function called by mclapply needed some variables that I had assigned locally with <- instead of globally with <<-. I guess they had been (inadvertently) assigned in the global environment while I was testing, which would explain why it worked in RStudio, whereas a fresh Rscript session did not have them.
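
For anyone hitting the same message, here is a minimal sketch of how this kind of scoping problem can be reproduced and diagnosed. It assumes a Linux or macOS host with forking available; `some_fun`, `scale_factor`, and `run` are hypothetical names, not the code from the question. mclapply returns "try-error" objects for failed jobs, so the underlying message can be read from the result instead of the generic warning:

```r
library(parallel)

# Hypothetical worker that depends on a variable it does not define itself.
some_fun <- function(i) i * scale_factor

run <- function() {
  scale_factor <- 10  # local to run(); some_fun() looks in its own scope, then globally
  mclapply(1:4, some_fun, mc.cores = 2)
}

# The generic "all scheduled cores encountered errors" message is a warning.
res <- suppressWarnings(run())

# Failed jobs come back as "try-error" objects carrying the real condition.
failed <- vapply(res, inherits, logical(1), what = "try-error")
first_msg <- conditionMessage(attr(res[[1]], "condition"))
print(first_msg)

# One possible fix: pass the value as an argument instead of relying on scope.
some_fun_fixed <- function(i, scale_factor) i * scale_factor
ok <- mclapply(1:4, some_fun_fixed, scale_factor = 10, mc.cores = 2)
```

Passing the value through mclapply's extra arguments avoids depending on whatever happens to be in the global environment, which is what differs between a long-running RStudio session and a fresh Rscript run.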
