Question: Normalizing using Seurat with large numbers of samples
Asked 4 weeks ago by Kristin Muench (United States)


I am trying to run Seurat on a fairly large scRNA-Seq experiment, with 16 samples ranging from 1,000 to 10,000 cells each.

In my first run of the pipeline, I merged all of the samples into a single Seurat object, like so:

data.combined <- MergeSeurat(object1 = J, object2 = E, add.cell.id1 = "J", 
    add.cell.id2 = "E", project = "all")
data.combined <- AddSamples(object = data.combined, new.data = ..., add.cell.id = "F")

...and then followed the tutorial. However, on the scaling step:

data.combined <- ScaleData(object = data.combined, vars.to.regress = c("nUMI"))

I get an error:

Error: vector memory exhausted (limit reached?)

I understand this means R ran out of RAM for the computation, which isn't surprising given the size of data.combined. How can I overcome this, short of finding a computational cluster to run it on? Because the number of UMIs differs so much between the 1,000-cell and 10,000-cell samples, it seems crucial to run this step on a single Seurat object containing all the data, rather than hack together a solution where I run ScaleData on subsets of the data and then stitch the results together afterwards.
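
For a sense of scale, here is a rough back-of-the-envelope estimate of how much memory one dense matrix of scaled values would need. It is only a sketch: the gene and cell counts are hypothetical placeholders (the real totals are not given above), the helper name dense_matrix_gib is made up for illustration, and it assumes the scaled output is held as an ordinary dense matrix of 8-byte doubles, which is my understanding of the scale.data slot in Seurat 2.x.

```r
# Rough size of a dense double-precision matrix, in GiB.
# R stores each double in 8 bytes; the small per-object overhead is ignored.
dense_matrix_gib <- function(n_genes, n_cells, bytes_per_value = 8) {
  n_genes * n_cells * bytes_per_value / 1024^3
}

# Hypothetical dimensions, NOT taken from the actual experiment:
# 16 samples averaging 5,000 cells each, and 20,000 genes.
n_cells <- 16 * 5000
n_genes <- 20000

est <- dense_matrix_gib(n_genes, n_cells)
cat(sprintf("Dense %.0f x %.0f matrix: about %.1f GiB\n", n_genes, n_cells, est))
```

Under these assumed numbers a single copy comes to roughly 11.9 GiB, before counting any temporary copies R makes during the computation, so a 16 GB machine would already be close to its limit. The estimate grows linearly with both the number of genes and the number of cells.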

Thank you for your help! I am working on an iMac with 16 GB of RAM. sessionInfo() is:

R version 3.5.1 (2018-07-02)
Platform: x86_64-apple-darwin15.6.0 (64-bit)
Running under: macOS  10.14

Matrix products: default
BLAS: /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/3.5/Resources/lib/libRlapack.dylib

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] parallel  stats4    stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] RColorBrewer_1.1-2          gdtools_0.1.7               biomaRt_2.36.1              ggrepel_0.8.0               edgeR_3.22.5               
 [6] limma_3.36.5                readr_1.1.1                 DESeqAid_0.2                DESeq2_1.20.0               SummarizedExperiment_1.10.1
[11] DelayedArray_0.6.6          BiocParallel_1.14.2         matrixStats_0.54.0          Biobase_2.40.0              GenomicRanges_1.32.7       
[16] GenomeInfoDb_1.16.0         IRanges_2.14.12             S4Vectors_0.18.3            BiocGenerics_0.26.0         bindrcpp_0.2.2             
[21] dplyr_0.7.6                 Seurat_2.3.4                Matrix_1.2-14               cowplot_0.9.4               ggplot2_3.0.0              

loaded via a namespace (and not attached):
  [1] snow_0.4-3             backports_1.1.2        Hmisc_4.1-1            plyr_1.8.4             igraph_1.2.2           lazyeval_0.2.1        
  [7] splines_3.5.1          digest_0.6.18          foreach_1.4.4          htmltools_0.3.6        lars_1.2               gdata_2.18.0          
 [13] magrittr_1.5           checkmate_1.8.5        memoise_1.1.0          cluster_2.0.7-1        mixtools_1.1.0         ROCR_1.0-7            
 [19] annotate_1.58.0        R.utils_2.7.0          svglite_1.2.1          prettyunits_1.0.2      colorspace_1.3-2       blob_1.1.1            
 [25] crayon_1.3.4           RCurl_1.95-4.11        jsonlite_1.5           genefilter_1.62.0      bindr_0.1.1            survival_2.42-6       
 [31] zoo_1.8-4              iterators_1.0.10       ape_5.2                glue_1.3.0             gtable_0.2.0           zlibbioc_1.26.0       
 [37] XVector_0.20.0         kernlab_0.9-27         prabclus_2.2-6         DEoptimR_1.0-8         scales_1.0.0           mvtnorm_1.0-8         
 [43] DBI_1.0.0              bibtex_0.4.2           Rcpp_0.12.19           metap_1.0              dtw_1.20-1             progress_1.2.0        
 [49] xtable_1.8-3           htmlTable_1.12         reticulate_1.10        foreign_0.8-71         bit_1.1-14             proxy_0.4-22          
 [55] mclust_5.4.2           SDMTools_1.1-221       Formula_1.2-3          tsne_0.1-3             htmlwidgets_1.3        httr_1.3.1            
 [61] gplots_3.0.1           fpc_2.1-11.1           acepack_1.4.1          modeltools_0.2-22      ica_1.0-2              pkgconfig_2.0.2       
 [67] XML_3.98-1.16          R.methodsS3_1.7.1      flexmix_2.3-14         nnet_7.3-12            locfit_1.5-9.1         tidyselect_0.2.5      
 [73] labeling_0.3           rlang_0.2.2            reshape2_1.4.3         AnnotationDbi_1.42.1   munsell_0.5.0          tools_3.5.1           
 [79] RSQLite_2.1.1          ggridges_0.5.1         evaluate_0.12          stringr_1.3.1          yaml_2.2.0             npsurv_0.4-0          
 [85] knitr_1.20             bit64_0.9-7            fitdistrplus_1.0-11    robustbase_0.93-3      caTools_1.17.1.1       purrr_0.2.5           
 [91] RANN_2.6.1             pbapply_1.3-4          nlme_3.1-137           R.oo_1.22.0            hdf5r_1.0.1            compiler_3.5.1        
 [97] rstudioapi_0.8         curl_3.2               png_0.1-7              lsei_1.2-0             statmod_1.4.30         tibble_1.4.2          
[103] geneplotter_1.58.0     stringi_1.2.4          lattice_0.20-35        trimcluster_0.1-2.1    pillar_1.3.0           Rdpack_0.10-1         
[109] lmtest_0.9-36          data.table_1.11.8      bitops_1.0-6           irlba_2.3.2            gbRd_0.4-11            R6_2.3.0              
[115] latticeExtra_0.6-28    KernSmooth_2.23-15     gridExtra_2.3          codetools_0.2-15       MASS_7.3-50            gtools_3.8.1          
[121] assertthat_0.2.0       rprojroot_1.3-2        withr_2.1.2            GenomeInfoDbData_1.1.0 hms_0.4.2              diptest_0.75-7        
[127] doSNOW_1.0.16          grid_3.5.1             rpart_4.1-13           tidyr_0.8.2            class_7.3-14           rmarkdown_1.10        
[133] segmented_0.5-3.0      Rtsne_0.15             base64enc_0.1-3
Tags: single cell, R, scrna-seq
