R is running out of memory
8 weeks ago
jenyfer • 0

When I try to load gene expression counts downloaded from TCGA (859 samples, raw counts with 60000 rows), I get an error and R stops for no apparent reason. When I checked, my memory was nearly full, but I have 16 GB of RAM.

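    library(readr)
    library(dplyr)
    library(purrr)

    counts_df <- counts_files %>%
      lapply(function(x) {
        # read one two-column file: gene ID and the raw count for that sample
        tmp <- read_tsv(x, col_names = FALSE) %>%
          purrr::set_names("gene_id", basename(x))
        cat(which(counts_files == x), "of", length(counts_files), "\n")
        return(tmp)
      }) %>%
      # join all samples by gene ID, one file at a time
      reduce(function(x, y) full_join(x, y, by = "gene_id")) %>%
      # reorder the columns to match the metadata, then rename them
      dplyr::select(gene_id, metadata$file_name) %>%
      set_names("gene_id", metadata$TCGA_id_full) %>%
      # drop the last 5 rows
      dplyr::slice(1:(nrow(.) - 5))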
TCGA MEMORY R • 290 views
8 weeks ago

It seems that R is using ~12GB of RAM. The remainder will be used up by Windows and associated processes.

You may consider one or more of these options:

  1. restart the computer and run just R / RStudio (and nothing else)
  2. avoid using RStudio - it uses up an unnecessary amount of extra RAM
  3. avoid using the dplyr functions - one has greater control over the data flow with base R functions
  4. avoid using %>%; instead, divide your code into different sections. You can use rm() and gc() between each section to remove unneeded objects and clear memory, respectively.
  5. rent an Amazon EC2 instance that has more RAM
  6. pre-filter the input object - most of those genes will have just 0 counts across all samples
  7. import the data via data.table::fread()
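
As a rough illustration of options 4, 6 and 7 combined, here is a minimal sketch that reads each file with data.table::fread(), fills one pre-allocated integer matrix instead of chaining hundreds of full_join() calls, and then drops all-zero genes. For scale, a 60000 x 859 integer matrix needs about 60000 * 859 * 4 bytes, i.e. roughly 206 MB. The helper name `read_counts_matrix` and the demo files are made up for this example, and the sketch assumes that every file has two columns (gene ID, count) with the same genes in the same order, which I have not checked against your data.

```r
library(data.table)

# Hypothetical helper (not from the original post): read the count column of
# every file into one pre-allocated integer matrix, one column per sample.
# ASSUMPTION: each file has two columns (gene ID, count), no header, and all
# files list the same genes in the same order.
read_counts_matrix <- function(files) {
  first <- fread(files[1], header = FALSE, col.names = c("gene_id", "count"))
  mat <- matrix(0L, nrow = nrow(first), ncol = length(files),
                dimnames = list(first$gene_id, basename(files)))
  for (i in seq_along(files)) {
    tmp <- fread(files[i], header = FALSE, col.names = c("gene_id", "count"))
    # stop early if the assumption about gene order does not hold
    stopifnot(identical(tmp$gene_id, first$gene_id))
    mat[, i] <- as.integer(tmp$count)
    rm(tmp)  # remove the per-file table before reading the next one
  }
  mat
}

# Tiny demo with three made-up files so that the sketch runs on its own
demo_dir <- tempfile("counts_demo")
dir.create(demo_dir)
genes <- c("geneA", "geneB", "geneC")
demo_counts <- list(c(5L, 0L, 2L), c(1L, 0L, 0L), c(7L, 0L, 3L))
demo_files <- file.path(demo_dir, paste0("sample", 1:3, ".tsv"))
for (i in seq_along(demo_files)) {
  fwrite(data.table(genes, demo_counts[[i]]), demo_files[i],
         sep = "\t", col.names = FALSE)
}

counts_mat <- read_counts_matrix(demo_files)

# pre-filter: keep only genes with at least one count in at least one sample
counts_mat <- counts_mat[rowSums(counts_mat) > 0, , drop = FALSE]
gc()  # ask R to release memory that is no longer referenced
```

With your own data you would pass `counts_files` instead of `demo_files`, and rename or reorder the columns from `metadata` afterwards. Note that this only works if the gene order really is identical across files; otherwise the stopifnot() check will fail and a join by gene ID is still needed.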

If you want further assistance on improving the code, then please provide the following:

  • a link to the data that you are using
  • a sample of the data pasted here
  • a sample of how you want the data to appear after all processing, i.e., desired output

Kevin


Thanks a lot for your kind advice. I ran this on my old computer, which I bought 6 years ago and which has 8 GB of RAM, with the same code and the same packages, and it produced a result with 60000 rows and 789 columns. However, I will still try your advice later. Thanks.
