Question: Normalizing using Seurat with large numbers of samples
18 months ago by
United States
Kristin Muench wrote:


I am trying to run Seurat on a fairly large scRNA-Seq experiment: 16 samples, each with between 1,000 and 10,000 cells.

In my first run of the pipeline, I merged all of the samples into a single Seurat object, like so:

data.combined <- MergeSeurat(object1 = J, object2 = E, add.cell.id1 = "J", 
    add.cell.id2 = "E", project = "all")
data.combined <- AddSamples(object = data.combined, =, = "F")

...and then followed the tutorial. However, on the scaling step:

data.combined <- ScaleData(object = data.combined, = c("nUMI"))

I get an error:

Error: vector memory exhausted (limit reached?)

I understand this means R ran out of RAM for the computation, which isn't surprising given the size of data.combined. How can I overcome this, short of finding a computational cluster to run it on? Because the number of UMIs differs so much between the 1,000-cell and 10,000-cell samples, it seems crucial to run this step on a Seurat object containing all the data, rather than hack together a solution where I run ScaleData on subsets of the data and then tack them together afterwards.

Thank you for your help! I am working on an iMac with 16 GB of RAM. sessionInfo() is:

R version 3.5.1 (2018-07-02)
Platform: x86_64-apple-darwin15.6.0 (64-bit)
Running under: macOS  10.14

Matrix products: default
BLAS: /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/3.5/Resources/lib/libRlapack.dylib

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] parallel  stats4    stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] RColorBrewer_1.1-2          gdtools_0.1.7               biomaRt_2.36.1              ggrepel_0.8.0               edgeR_3.22.5               
 [6] limma_3.36.5                readr_1.1.1                 DESeqAid_0.2                DESeq2_1.20.0               SummarizedExperiment_1.10.1
[11] DelayedArray_0.6.6          BiocParallel_1.14.2         matrixStats_0.54.0          Biobase_2.40.0              GenomicRanges_1.32.7       
[16] GenomeInfoDb_1.16.0         IRanges_2.14.12             S4Vectors_0.18.3            BiocGenerics_0.26.0         bindrcpp_0.2.2             
[21] dplyr_0.7.6                 Seurat_2.3.4                Matrix_1.2-14               cowplot_0.9.4               ggplot2_3.0.0              

loaded via a namespace (and not attached):
  [1] snow_0.4-3             backports_1.1.2        Hmisc_4.1-1            plyr_1.8.4             igraph_1.2.2           lazyeval_0.2.1        
  [7] splines_3.5.1          digest_0.6.18          foreach_1.4.4          htmltools_0.3.6        lars_1.2               gdata_2.18.0          
 [13] magrittr_1.5           checkmate_1.8.5        memoise_1.1.0          cluster_2.0.7-1        mixtools_1.1.0         ROCR_1.0-7            
 [19] annotate_1.58.0        R.utils_2.7.0          svglite_1.2.1          prettyunits_1.0.2      colorspace_1.3-2       blob_1.1.1            
 [25] crayon_1.3.4           RCurl_1.95-4.11        jsonlite_1.5           genefilter_1.62.0      bindr_0.1.1            survival_2.42-6       
 [31] zoo_1.8-4              iterators_1.0.10       ape_5.2                glue_1.3.0             gtable_0.2.0           zlibbioc_1.26.0       
 [37] XVector_0.20.0         kernlab_0.9-27         prabclus_2.2-6         DEoptimR_1.0-8         scales_1.0.0           mvtnorm_1.0-8         
 [43] DBI_1.0.0              bibtex_0.4.2           Rcpp_0.12.19           metap_1.0              dtw_1.20-1             progress_1.2.0        
 [49] xtable_1.8-3           htmlTable_1.12         reticulate_1.10        foreign_0.8-71         bit_1.1-14             proxy_0.4-22          
 [55] mclust_5.4.2           SDMTools_1.1-221       Formula_1.2-3          tsne_0.1-3             htmlwidgets_1.3        httr_1.3.1            
 [61] gplots_3.0.1           fpc_2.1-11.1           acepack_1.4.1          modeltools_0.2-22      ica_1.0-2              pkgconfig_2.0.2       
 [67] XML_3.98-1.16          R.methodsS3_1.7.1      flexmix_2.3-14         nnet_7.3-12            locfit_1.5-9.1         tidyselect_0.2.5      
 [73] labeling_0.3           rlang_0.2.2            reshape2_1.4.3         AnnotationDbi_1.42.1   munsell_0.5.0          tools_3.5.1           
 [79] RSQLite_2.1.1          ggridges_0.5.1         evaluate_0.12          stringr_1.3.1          yaml_2.2.0             npsurv_0.4-0          
 [85] knitr_1.20             bit64_0.9-7            fitdistrplus_1.0-11    robustbase_0.93-3      caTools_1.17.1.1       purrr_0.2.5           
 [91] RANN_2.6.1             pbapply_1.3-4          nlme_3.1-137           R.oo_1.22.0            hdf5r_1.0.1            compiler_3.5.1        
 [97] rstudioapi_0.8         curl_3.2               png_0.1-7              lsei_1.2-0             statmod_1.4.30         tibble_1.4.2          
[103] geneplotter_1.58.0     stringi_1.2.4          lattice_0.20-35        trimcluster_0.1-2.1    pillar_1.3.0           Rdpack_0.10-1         
[109] lmtest_0.9-36          data.table_1.11.8      bitops_1.0-6           irlba_2.3.2            gbRd_0.4-11            R6_2.3.0              
[115] latticeExtra_0.6-28    KernSmooth_2.23-15     gridExtra_2.3          codetools_0.2-15       MASS_7.3-50            gtools_3.8.1          
[121] assertthat_0.2.0       rprojroot_1.3-2        withr_2.1.2            GenomeInfoDbData_1.1.0 hms_0.4.2              diptest_0.75-7        
[127] doSNOW_1.0.16          grid_3.5.1             rpart_4.1-13           tidyr_0.8.2            class_7.3-14           rmarkdown_1.10        
[133] segmented_0.5-3.0      Rtsne_0.15             base64enc_0.1-3
16 months ago by
United States
Kristin Muench wrote:

Update from the future: I did end up running these scripts on a computational cluster using a job scheduler. Once I could run them with 64 to 128 GB of RAM, the error no longer occurred and the scripts ran as expected.
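
For a rough sense of why 16 GB was not enough: as far as I know, Seurat 2.x keeps the output of ScaleData as a dense matrix, so its size is roughly genes × cells × 8 bytes. The sketch below is back-of-the-envelope arithmetic, not a measurement from this dataset. The sample count and per-sample cell range come from the question; the gene count is an assumed, illustrative value.

```r
# Rough size of a dense double-precision matrix (8 bytes per value), in GiB.
dense_matrix_gib <- function(n_genes, n_cells, bytes_per_value = 8) {
  n_genes * n_cells * bytes_per_value / 1024^3
}

n_samples <- 16      # from the question
n_genes   <- 20000   # ASSUMPTION: illustrative gene count, not from the post

# Total cells if every sample sat at the low or the high end of the stated range.
total_cells <- n_samples * c(low = 1000, high = 10000)

# Estimated size of the scaled matrix alone at each extreme.
estimate_gib <- dense_matrix_gib(n_genes, total_cells)
print(round(estimate_gib, 1))
```

With these assumed numbers the scaled matrix alone comes to roughly 2.4 GiB at the low end and 23.8 GiB at the high end, before counting the raw and normalized data or any temporary copies made during regression.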

To run an R script from the command line instead of from RStudio, I use Rscript.
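
The exact command from the original post was not preserved. As a minimal sketch, a script meant to be launched with Rscript might look like the following; the file names and the two positional arguments are hypothetical and only illustrate how arguments reach the script.

```r
# scale_data.R  (hypothetical file name)
# Run from a terminal, for example:
#   Rscript scale_data.R data_combined.rds data_combined_scaled.rds

# Arguments given after the script name; empty when none are supplied.
args <- commandArgs(trailingOnly = TRUE)

# Fall back to hypothetical default paths when arguments are missing.
input_path  <- if (length(args) >= 1) args[1] else "data_combined.rds"
output_path <- if (length(args) >= 2) args[2] else "data_combined_scaled.rds"

message("Reading from: ", input_path)
message("Writing to:   ", output_path)

# The Seurat steps from the question (merging samples, ScaleData, ...)
# would go here, reading from input_path and saving to output_path.
```

On a cluster, the same Rscript line would typically go inside the job script submitted to the scheduler, with the memory request set there.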
