Question: Normalizing using Seurat with large numbers of samples
Kristin Muench (United States) wrote, 4 months ago:

Hello,

I am trying to run Seurat on a fairly large scRNA-Seq experiment with 16 samples, each containing between 1,000 and 10,000 cells.

In my first run of the pipeline, I merged all of the samples into a single Seurat object, like so:

data.combined <- MergeSeurat(object1 = J, object2 = E, add.cell.id1 = "J", 
    add.cell.id2 = "E", project = "all")
data.combined <- AddSamples(object = data.combined, new.data = F.data, add.cell.id = "F")

...and then followed the tutorial. However, on the scaling step:

data.combined <- ScaleData(object = data.combined, vars.to.regress = c("nUMI"))

I get an error:

Error: vector memory exhausted (limit reached?)

I understand this means R ran out of RAM for the computation, which isn't surprising given the size of data.combined. How can I overcome this, short of finding a computational cluster to run it on? Because the number of UMIs differs so much between the 1,000-cell and the 10,000-cell samples, it seems crucial to run this step on a Seurat object containing all the data, rather than hacking together a solution where I run ScaleData on subsets of the data and stitch the results together afterwards.
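To get a feel for why 16 GB is not enough, here is a rough back-of-the-envelope sketch in base R. It assumes the scaled output is held as a dense matrix of doubles (8 bytes per value) with one row per gene and one column per cell. The gene counts (20,000 in total, 2,000 variable genes) are illustrative assumptions and not numbers from this dataset, and `dense_matrix_gb` is a hypothetical helper name.

```r
# Rough size, in decimal GB, of a dense matrix of doubles (8 bytes per value).
dense_matrix_gb <- function(n_genes, n_cells, bytes_per_value = 8) {
  n_genes * n_cells * bytes_per_value / 1e9
}

n_samples  <- 16
cells_low  <- n_samples * 1000    # every sample at the low end of the range
cells_high <- n_samples * 10000   # every sample at the high end of the range

n_genes_all <- 20000  # ASSUMPTION: illustrative total number of genes
n_genes_var <- 2000   # ASSUMPTION: illustrative number of variable genes

estimates <- data.frame(
  genes = c(n_genes_all, n_genes_all, n_genes_var, n_genes_var),
  cells = c(cells_low, cells_high, cells_low, cells_high)
)
estimates$size_gb <- dense_matrix_gb(estimates$genes, estimates$cells)

print(estimates)
```

Under these assumptions a single dense copy of the full matrix at the high end is about 25.6 GB, before counting any temporary copies made during regression. Restricting scaling to a subset of genes shrinks this proportionally; in Seurat 2.x, ScaleData has a genes.use argument that may help here, though I have not tested it on this dataset. Separately, on macOS with R 3.5 the "vector memory exhausted" limit is governed by the R_MAX_VSIZE environment variable, so raising it may let the job proceed using swap, at the cost of speed.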

Thank you for your help! I am working on an iMac with 16 GB of RAM. sessionInfo() is:

R version 3.5.1 (2018-07-02)
Platform: x86_64-apple-darwin15.6.0 (64-bit)
Running under: macOS  10.14

Matrix products: default
BLAS: /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/3.5/Resources/lib/libRlapack.dylib

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] parallel  stats4    stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] RColorBrewer_1.1-2          gdtools_0.1.7               biomaRt_2.36.1              ggrepel_0.8.0               edgeR_3.22.5               
 [6] limma_3.36.5                readr_1.1.1                 DESeqAid_0.2                DESeq2_1.20.0               SummarizedExperiment_1.10.1
[11] DelayedArray_0.6.6          BiocParallel_1.14.2         matrixStats_0.54.0          Biobase_2.40.0              GenomicRanges_1.32.7       
[16] GenomeInfoDb_1.16.0         IRanges_2.14.12             S4Vectors_0.18.3            BiocGenerics_0.26.0         bindrcpp_0.2.2             
[21] dplyr_0.7.6                 Seurat_2.3.4                Matrix_1.2-14               cowplot_0.9.4               ggplot2_3.0.0              

loaded via a namespace (and not attached):
  [1] snow_0.4-3             backports_1.1.2        Hmisc_4.1-1            plyr_1.8.4             igraph_1.2.2           lazyeval_0.2.1        
  [7] splines_3.5.1          digest_0.6.18          foreach_1.4.4          htmltools_0.3.6        lars_1.2               gdata_2.18.0          
 [13] magrittr_1.5           checkmate_1.8.5        memoise_1.1.0          cluster_2.0.7-1        mixtools_1.1.0         ROCR_1.0-7            
 [19] annotate_1.58.0        R.utils_2.7.0          svglite_1.2.1          prettyunits_1.0.2      colorspace_1.3-2       blob_1.1.1            
 [25] crayon_1.3.4           RCurl_1.95-4.11        jsonlite_1.5           genefilter_1.62.0      bindr_0.1.1            survival_2.42-6       
 [31] zoo_1.8-4              iterators_1.0.10       ape_5.2                glue_1.3.0             gtable_0.2.0           zlibbioc_1.26.0       
 [37] XVector_0.20.0         kernlab_0.9-27         prabclus_2.2-6         DEoptimR_1.0-8         scales_1.0.0           mvtnorm_1.0-8         
 [43] DBI_1.0.0              bibtex_0.4.2           Rcpp_0.12.19           metap_1.0              dtw_1.20-1             progress_1.2.0        
 [49] xtable_1.8-3           htmlTable_1.12         reticulate_1.10        foreign_0.8-71         bit_1.1-14             proxy_0.4-22          
 [55] mclust_5.4.2           SDMTools_1.1-221       Formula_1.2-3          tsne_0.1-3             htmlwidgets_1.3        httr_1.3.1            
 [61] gplots_3.0.1           fpc_2.1-11.1           acepack_1.4.1          modeltools_0.2-22      ica_1.0-2              pkgconfig_2.0.2       
 [67] XML_3.98-1.16          R.methodsS3_1.7.1      flexmix_2.3-14         nnet_7.3-12            locfit_1.5-9.1         tidyselect_0.2.5      
 [73] labeling_0.3           rlang_0.2.2            reshape2_1.4.3         AnnotationDbi_1.42.1   munsell_0.5.0          tools_3.5.1           
 [79] RSQLite_2.1.1          ggridges_0.5.1         evaluate_0.12          stringr_1.3.1          yaml_2.2.0             npsurv_0.4-0          
 [85] knitr_1.20             bit64_0.9-7            fitdistrplus_1.0-11    robustbase_0.93-3      caTools_1.17.1.1       purrr_0.2.5           
 [91] RANN_2.6.1             pbapply_1.3-4          nlme_3.1-137           R.oo_1.22.0            hdf5r_1.0.1            compiler_3.5.1        
 [97] rstudioapi_0.8         curl_3.2               png_0.1-7              lsei_1.2-0             statmod_1.4.30         tibble_1.4.2          
[103] geneplotter_1.58.0     stringi_1.2.4          lattice_0.20-35        trimcluster_0.1-2.1    pillar_1.3.0           Rdpack_0.10-1         
[109] lmtest_0.9-36          data.table_1.11.8      bitops_1.0-6           irlba_2.3.2            gbRd_0.4-11            R6_2.3.0              
[115] latticeExtra_0.6-28    KernSmooth_2.23-15     gridExtra_2.3          codetools_0.2-15       MASS_7.3-50            gtools_3.8.1          
[121] assertthat_0.2.0       rprojroot_1.3-2        withr_2.1.2            GenomeInfoDbData_1.1.0 hms_0.4.2              diptest_0.75-7        
[127] doSNOW_1.0.16          grid_3.5.1             rpart_4.1-13           tidyr_0.8.2            class_7.3-14           rmarkdown_1.10        
[133] segmented_0.5-3.0      Rtsne_0.15             base64enc_0.1-3

Tags: single cell, R, scrna-seq

Kristin Muench (United States) wrote, 9 weeks ago:

Update from the future: I ended up running these scripts on a computational cluster through a job scheduler. Once the jobs had between 64 and 128 GB of RAM available, the error no longer occurred and the scripts ran as expected.

To run an R script from the command line instead of from RStudio, I use Rscript: https://support.rstudio.com/hc/en-us/articles/218012917-How-to-run-R-scripts-from-the-command-line
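For anyone setting this up, here is a minimal sketch of a script written to be launched with Rscript. The file name `scale_job.R`, the helper `parse_job_args`, and the two .rds paths are hypothetical placeholders, not the actual scripts I ran; the analysis steps themselves are left out.

```r
# scale_job.R (hypothetical file name)
# Run from a terminal or from a scheduler job script as:
#   Rscript scale_job.R input.rds output.rds

# Read the input and output paths from the command line.
parse_job_args <- function(args = commandArgs(trailingOnly = TRUE)) {
  if (length(args) != 2) {
    stop("usage: Rscript scale_job.R <input.rds> <output.rds>")
  }
  list(input = args[1], output = args[2])
}

# Only run the job when launched non-interactively with arguments.
if (!interactive() && length(commandArgs(trailingOnly = TRUE)) > 0) {
  job <- parse_job_args()
  message("Reading ", job$input)
  obj <- readRDS(job$input)
  # ... analysis steps (for example the ScaleData call) would go here ...
  saveRDS(obj, job$output)
  message("Wrote ", job$output)
}
```

Passing the file paths as arguments keeps the same script usable for different inputs, so the job scheduler only needs to change the command line and the memory request.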
