RnBeads SLURM submission problem.
6 months ago
Yuna • 0

Hi, I have tried to run RnBeads on our HPC cluster (SLURM) and ran into some issues. I suspect a memory problem, since each node on our HPC has only 4 GB of RAM. To work around this, I set rnb.cr <- setModuleResourceRequirements(rnb.cr,c(mem="32G"),"all"), but the problem persisted. I think I may need to do something more to request a total of 32 GB of RAM, possibly across multiple nodes, but I couldn't find any example script using SLURM. Any help would be appreciated; I have been trying to import my .cov files for 4 days and all of my attempts have failed so far. Many thanks!!!

This is what I did. The XML file:

<rnb.xml>
<analysis.name>Methylation_Seq</analysis.name>
<data.source>data/Methylation_cov/dataset/coverage,data/Methylation_cov/dataset/sample_annotation.csv</data.source>
<dir.reports>reports</dir.reports>
<data.type>bed.dir</data.type>
<assembly>hg38</assembly>
<region.types>tiling,genes,promoters,cpgislands</region.types>
<identifiers.column>filename_bed</identifiers.column>
<colors.category>#1B9E77,#D95F02,#7570B3,#E7298A,#66A61E,#E6AB02,#A6761D,#666666,#2166AC,#B2182B</colors.category>
<min.group.size>2</min.group.size>
<max.group.count>20</max.group.count>
<import.bed.style>bismarkCov</import.bed.style>
<qc.coverage.plots>true</qc.coverage.plots>
<filtering.sex.chromosomes.removal>true</filtering.sex.chromosomes.removal>
<filtering.missing.value.quantile>0.5</filtering.missing.value.quantile>
<filtering.coverage.threshold>5</filtering.coverage.threshold>
<filtering.low.coverage.masking>true</filtering.low.coverage.masking>
<filtering.high.coverage.outliers>true</filtering.high.coverage.outliers>
<filtering.greedycut>false</filtering.greedycut>
<exploratory.columns>Sample_ID,CellType,Sample_Group</exploratory.columns>
<exploratory.intersample>false</exploratory.intersample>
<exploratory.region.profiles>genes,promoters</exploratory.region.profiles>
<differential.site.test.method>limma</differential.site.test.method>
<differential.comparison.columns>CellType,Sample_Group</differential.comparison.columns>
<differential.enrichment>true</differential.enrichment>
<export.to.trackhub>bigBed</export.to.trackhub>
<logging.memory>true</logging.memory>
<disk.dump.big.matrices>true</disk.dump.big.matrices>
<enforce.memory.management>true</enforce.memory.management>
<gz.large.files>true</gz.large.files>
</rnb.xml>

The R code I executed:

library(RnBeads) 
xml.file <- "submit.xml"
arch <- new("ClusterArchitectureSLURM")
rnb.cr <- new("RnBClusterRun",arch)
rnb.cr <- setModuleResourceRequirements(rnb.cr,c(mem="32G"),"all")
rnb.cr <- setModuleNumCores(rnb.cr,4L,"all")
rnb.cr <- setModuleNumCores(rnb.cr,2L,"exploratory")
run(rnb.cr, "rnbeads_analysis", xml.file) 

This is the import.log file:

2021-05-25 22:50:15     1.5  STATUS ...Started module: import
2021-05-25 22:50:15     1.5  STATUS STARTED Configuring Analysis
2021-05-25 22:50:15     1.5 WARNING     The option 'differential.enrichment' no longer exists. Note, that RnBeads now supports GO and LOLA enrichment. Your option setting will be applied to the new option 'differential.enrichment.go'
2021-05-25 22:50:15     1.5 WARNING     The option 'differential.enrichment' no longer exists. Note, that RnBeads now supports GO and LOLA enrichment. Your option setting will be applied to the new option 'differential.enrichment.go'
2021-05-25 22:50:15     1.5    INFO     Machine name: hpc-d36-5-4.local
2021-05-25 22:50:15     1.5  STATUS     STARTED Setting up Multicore
2021-05-25 22:50:15     1.5    INFO         Using 4 cores
2021-05-25 22:50:15     1.5  STATUS     COMPLETED Setting up Multicore
2021-05-25 22:50:15     1.5    INFO     Analysis Title: Methylation_Seq
2021-05-25 22:50:15     1.5    INFO     Number of cores: 4
2021-05-25 22:50:15     1.5  STATUS COMPLETED Configuring Analysis

2021-05-25 22:50:15     1.5  STATUS STARTED Loading Data
2021-05-25 22:50:15     1.5    INFO     Number of cores: 4
2021-05-25 22:50:15     1.5    INFO     Loading data of type "bed.dir"
2021-05-25 22:50:15     1.5  STATUS     STARTED Performing loading test
2021-05-25 22:50:15     1.5    INFO         The first 10000 rows will be read from each data file
2021-05-25 22:50:15     1.5    INFO         No column with file names specified: will try to find one
2021-05-25 22:50:15     1.5  STATUS         STARTED Loading Data From BED Files
2021-05-25 22:50:19     1.6  STATUS             STARTED Automatically parsing the provided sample annotation file
2021-05-25 22:50:19     1.6  STATUS                 Potential file names found in column 1 of the supplied annotation table
2021-05-25 22:50:19     1.6  STATUS             COMPLETED Automatically parsing the provided sample annotation file
2021-05-25 22:50:19     1.6    INFO             Reading BED file: data/Methylation_cov/dataset/coverage/HDF1_1_bismark.cov
2021-05-25 22:50:33     1.6    INFO             Reading BED file: data/Methylation_cov/dataset/coverage/HDF1_2_bismark.cov
2021-05-25 22:50:47     1.6    INFO             Reading BED file: data/Methylation_cov/dataset/coverage/HDF2_1_bismark.cov
2021-05-25 22:51:00     1.6    INFO             Reading BED file: data/Methylation_cov/dataset/coverage/HDF2_2_bismark.cov
2021-05-25 22:51:14     1.6    INFO             Reading BED file: data/Methylation_cov/dataset/coverage/HDF3_1_bismark.cov
2021-05-25 22:51:28     1.6    INFO             Reading BED file: data/Methylation_cov/dataset/coverage/HDF3_2_bismark.cov
2021-05-25 22:51:41     1.6    INFO             Reading BED file: data/Methylation_cov/dataset/coverage/HDFPA_1_bismark.cov
2021-05-25 22:51:55     1.6    INFO             Reading BED file: data/Methylation_cov/dataset/coverage/HDFPA_2_bismark.cov
2021-05-25 22:52:08     1.6    INFO             Reading BED file: data/Methylation_cov/dataset/coverage/KPA_1_bismark.cov
2021-05-25 22:52:22     1.6    INFO             Reading BED file: data/Methylation_cov/dataset/coverage/KPA_2_bismark.cov
2021-05-25 22:52:36     1.6    INFO             Reading BED file: data/Methylation_cov/dataset/coverage/P5_1_bismark.cov
2021-05-25 22:52:49     1.6    INFO             Reading BED file: data/Methylation_cov/dataset/coverage/P5_2_bismark.cov
2021-05-25 22:53:03     1.6    INFO             Reading BED file: data/Methylation_cov/dataset/coverage/P16_1_bismark.cov
2021-05-25 22:53:16     1.6    INFO             Reading BED file: data/Methylation_cov/dataset/coverage/P16_2_bismark.cov
2021-05-25 22:53:30     1.6    INFO             Reading BED file: data/Methylation_cov/dataset/coverage/P17_1_bismark.cov
2021-05-25 22:53:44     1.6    INFO             Reading BED file: data/Methylation_cov/dataset/coverage/P17_2_bismark.cov
2021-05-25 22:53:58     1.6    INFO             Reading BED file: data/Methylation_cov/dataset/coverage/P21_1_bismark.cov
2021-05-25 22:54:11     1.6    INFO             Reading BED file: data/Methylation_cov/dataset/coverage/P21_2_bismark.cov
2021-05-25 22:54:25     1.6    INFO             Reading BED file: data/Methylation_cov/dataset/coverage/P24_1_bismark.cov
2021-05-25 22:54:38     1.6    INFO             Reading BED file: data/Methylation_cov/dataset/coverage/P24_2_bismark.cov
2021-05-25 22:54:51     1.6  STATUS             Read 20 BED files
2021-05-25 22:54:52     1.6  STATUS             Matched chromosomes and strands to annotation
2021-05-25 22:54:52     1.6  STATUS             Checked for the presence of sites and coverage
2021-05-25 22:54:52     1.6  STATUS             Initialized meth/covg matrices
opening ff /tmp/Rtmpt5pObc/ff/ff2f4c7d78799e.ff
2021-05-25 22:55:06     1.6  STATUS             Combined a data matrix with 14820 sites and 20 samples
2021-05-25 22:55:06     1.6  STATUS             Processed all BED files
2021-05-25 22:55:19     1.6  STATUS             STARTED Creating RnBiseqSet object
2021-05-25 22:55:19     1.6    INFO                 Removed 14820 sites with unknown chromosomes
2021-05-25 22:55:19     1.6 WARNING                 All sites have been removed, returning NULL
2021-05-25 22:55:19     1.6  STATUS             COMPLETED Creating RnBiseqSet object
2021-05-25 22:55:19     1.6  STATUS         COMPLETED Loading Data From BED Files
2021-05-25 22:55:19     1.6  STATUS         STARTED Checking the loaded object
2021-05-25 22:55:19     1.6    INFO             The supplied object is not of class RnBiseqSet. Breaking the check...
2021-05-25 22:55:19     1.6 WARNING             The object loaded during the loading test contains invalid information (see details above). Please check the whether the data source arguments as well as the data import options, like table separator, BED style or BED column assignment, are set correctly
2021-05-25 22:55:19     1.6  STATUS         COMPLETED Checking the loaded object
2021-05-25 22:55:32     1.6  STATUS     COMPLETED Performing loading test
2021-05-25 22:55:32     1.6    INFO     No column with file names specified: will try to find one
2021-05-25 22:55:32     1.6  STATUS     STARTED Loading Data From BED Files
2021-05-25 22:55:32     1.6  STATUS         STARTED Automatically parsing the provided sample annotation file
2021-05-25 22:55:32     1.6  STATUS             Potential file names found in column 1 of the supplied annotation table
2021-05-25 22:55:32     1.6  STATUS         COMPLETED Automatically parsing the provided sample annotation file
2021-05-25 22:55:32     1.6    INFO         Reading BED file: data/Methylation_cov/dataset/coverage/HDF1_1_bismark.cov
/var/spool/slurmd/job1756391/slurm_script: line 4: 12108 Killed                  Rscript /gpfs/home/ys16b/R/x86_64-redhat-linux-gnu-library/4.0/RnBeads/extdata/Rscript/rscript_import.R -x submit.xml -o reports/cluster_run -c 4
slurmstepd: error: Detected 1 oom-kill event(s) in step 1756391.batch cgroup. Some of your processes may have been killed by the cgroup out-of-memory handler.

So I changed the memory option in the rnb.cr resource requirements from mem="32G" to mem.size="32G" after reading the source code of the RnBeads package. The run lasted much longer than the previous one, but it crashed again. I have also posted to the RnBeads GitHub page and will update here if I solve the issue. Thanks!

This is what I did:

In R:

library(RnBeads) 
xml.file <- "submit.xml"
arch <- new("ClusterArchitectureSLURM")
rnb.cr <- new("RnBClusterRun",arch)
rnb.cr <- setModuleResourceRequirements(rnb.cr,c(mem.size="32G"),"all")
rnb.cr <- setModuleNumCores(rnb.cr,4L,"all")
rnb.cr <- setModuleNumCores(rnb.cr,2L,"exploratory")
run(rnb.cr, "rnbeads_analysis", xml.file) 

And, this is the new error message:

2021-05-26 12:17:58     1.6  STATUS         COMPLETED Automatically parsing the provided sample annotation file
2021-05-26 12:17:58     1.6    INFO         Reading BED file: data/Methylation_cov/dataset/coverage/HDF1_1_bismark.cov
2021-05-26 12:26:25     2.4    INFO         Reading BED file: data/Methylation_cov/dataset/coverage/HDF1_2_bismark.cov
2021-05-26 12:33:59     2.9    INFO         Reading BED file: data/Methylation_cov/dataset/coverage/HDF2_1_bismark.cov
2021-05-26 12:42:48     3.0    INFO         Reading BED file: data/Methylation_cov/dataset/coverage/HDF2_2_bismark.cov
2021-05-26 12:49:06     2.8    INFO         Reading BED file: data/Methylation_cov/dataset/coverage/HDF3_1_bismark.cov
2021-05-26 12:58:07     3.0    INFO         Reading BED file: data/Methylation_cov/dataset/coverage/HDF3_2_bismark.cov
2021-05-26 13:06:58     3.0    INFO         Reading BED file: data/Methylation_cov/dataset/coverage/HDFPA_1_bismark.cov
2021-05-26 13:15:57     3.0    INFO         Reading BED file: data/Methylation_cov/dataset/coverage/HDFPA_2_bismark.cov
2021-05-26 13:24:12     3.0    INFO         Reading BED file: data/Methylation_cov/dataset/coverage/KPA_1_bismark.cov
2021-05-26 13:33:03     3.0    INFO         Reading BED file: data/Methylation_cov/dataset/coverage/KPA_2_bismark.cov
2021-05-26 13:41:11     2.9    INFO         Reading BED file: data/Methylation_cov/dataset/coverage/P5_1_bismark.cov
2021-05-26 13:50:12     3.0    INFO         Reading BED file: data/Methylation_cov/dataset/coverage/P5_2_bismark.cov
2021-05-26 13:58:57     3.0    INFO         Reading BED file: data/Methylation_cov/dataset/coverage/P16_1_bismark.cov
2021-05-26 14:07:30     3.0    INFO         Reading BED file: data/Methylation_cov/dataset/coverage/P16_2_bismark.cov
2021-05-26 14:16:13     3.0    INFO         Reading BED file: data/Methylation_cov/dataset/coverage/P17_1_bismark.cov
2021-05-26 14:24:47     2.9    INFO         Reading BED file: data/Methylation_cov/dataset/coverage/P17_2_bismark.cov

 *** caught bus error ***
address 0x2b856937c000, cause 'non-existent physical address'

Traceback:
 1: `[<-.ff`(`*tmp*`, i2, value = c(0, 0, 0, 0, 25, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0$
 2: `[<-`(`*tmp*`, i2, value = c(0, 0, 0, 0, 25, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0$
 3: `[<-.ffdf`(`*tmp*`, i, , value = list(V1 = c(177L, 177L, 177L, 177L, 177L, 177L, 177L, 177L, 177L, 177L, 177L, 177L, 177L, 177L, 177L, 177L, 177L$
 4: `[<-`(`*tmp*`, i, , value = list(V1 = c(177L, 177L, 177L, 177L, 177L, 177L, 177L, 177L, 177L, 177L, 177L, 177L, 177L, 177L, 177L, 177L, 177L, 177$
 5: read.table.ffdf(FUN = "read.delim", ...)
 6: read.delim.ffdf(x = NULL, file = file, next.rows = 10000L, ...,     colClasses = columnClasses, header = FALSE)
 7: FUN(X[[i]], ...)
 8: lapply(file.names, read.single.bed, context = "cg", ..., skip = skip.lines,     ffread = useff)
 9: lapply(file.names, read.single.bed, context = "cg", ..., skip = skip.lines,     ffread = useff)
10: read.bed.files(base.dir = data.source[[1]], sample.sheet = data.source[[2]],     file.names.col = filename.column, verbose = verbose, skip.lines $
11: rnb.execute.import(data.source, data.type)
12: rnb.step.import(data.source, data.type, report)
13: rnb.run.import(data.source, data.type, report.dir)
An irrecoverable exception occurred. R is aborting now ...
/var/spool/slurmd/job1764089/slurm_script: line 4: 24168 Bus error      

A bus error here also indicates that your job ran out of memory. You may want to work with your local sysadmins, since the job itself appears to be running fine; you just need to find the right combination of SLURM options for it.

6 months ago
Ido Tamir 5.2k

The only relevant line that you posted is:

slurmstepd: error: Detected 1 oom-kill event(s) in step 1756391.batch cgroup. Some of your processes may have been killed by the cgroup out-of-memory handler.

You did not post the important lines showing how you submitted your job. To reserve 32 GB on a node with SLURM, you have to submit with

--mem 32G

If this is not possible because your nodes have less RAM, you will see the job rejected when you check it with sj [jobnumber]

sinfo -a --Node --long

will tell you how much memory the nodes have in principle. How much of it you can actually request, you can find out after reading your cluster's manual.
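For reference, this is roughly what a hand-written submission script with such a reservation would look like (a sketch only; the log name is a placeholder, and in this thread the actual job scripts are generated by RnBeads itself). Note that --mem reserves memory on a single node: SLURM cannot pool RAM from several 4 GB nodes into one R process, so the import step needs at least one node with enough physical memory.

```shell
#!/bin/bash
#SBATCH --job-name=rnbeads_import
#SBATCH --mem=32G            # per-node reservation; requires a node with >= 32 GB RAM
#SBATCH --cpus-per-task=4
#SBATCH -o rnbeads_%j.log    # placeholder log file name

# the command RnBeads would otherwise wrap and submit itself
Rscript rscript_import.R -x submit.xml -o reports/cluster_run -c 4
```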


Hi, thank you for the post.

RnBeads has its own mechanism for submitting jobs to SLURM, which is why it is hard to figure out which part has the issue. I used rnb.cr <- setModuleResourceRequirements(rnb.cr,c(mem="32G"),"all") expecting it to translate into --mem 32G, but apparently it didn't work that way, since the job crashed. This is their architectureSLURM.R file:

################################################################################
# Cluster Architecture Descriptions
################################################################################
################################################################################
# Concrete implementations for the SLURM environment
################################################################################

#' ClusterArchitectureSLURM Class
#'
#' A child class of \code{\linkS4class{ClusterArchitecture}} implementing specifications of Simple Linux Utility for Resource Management (SLURM) architectures.
#'
#' @details
#' Follow this template if you want to create your own ClusterArchitecture class.
#'
#' @section Slots:
#' see \code{\linkS4class{ClusterArchitecture}}
#'
#' @section Methods:
#' \describe{
#'   \item{\code{\link{getSubCmdTokens,ClusterArchitectureSGE-method}}}{Returns a vector of command line tokens corresponding to submitting
#'   a job with the given command to the cluster}
#' }
#'
#' @name ClusterArchitectureSLURM-class
#' @rdname ClusterArchitectureSLURM-class
#' @author Michael Scherer
#' @exportClass ClusterArchitecture
setClass("ClusterArchitectureSLURM",
    contains = "ClusterArchitecture"
)

#' initialize.ClusterArchitectureSLURM
#'
#' Initialize an ClusterArchitecture object for a SLURM
#' 
#' @param .Object New instance of \code{ClusterArchitectureSLURM}.
#' @param name A name or identifier
#' @param ... arguments passed on to the constructor of \code{\linkS4class{ClusterArchitecture}} (the parent class)
#'
#' @export
#' @author Michael Scherer
#' @docType methods
setMethod("initialize","ClusterArchitectureSLURM",
    function(
        .Object,
        name="ClusterArchitectureSLURM",
        ...
    ) {
        .Object <- callNextMethod(.Object=.Object, name=name, ...)
        .Object <- setExecutable(.Object,"R","R")
        .Object <- setExecutable(.Object,"Rscript","Rscript")
        .Object <- setExecutable(.Object,"python","python")
        .Object@getSubCmdTokens.optional.args <- c("sub.binary","quote.cmd")
        .Object
    }
)

#' getSubCmdTokens-methods
#'
#' Returns a string for the command line corresponding to submitting
#' a job with the given command to the cluster.
#' @details
#' For a concrete child class implementation for a SLURM architecture specification see \code{\linkS4class{ClusterArchitectureSLURM}}
#'
#' @param object \code{\linkS4class{ClusterArchitectureSLURM}} object
#' @param cmd.tokens a character vector specifying the executable command that should be wrapped in the cluster submission command
#' @param log file name and path of the log file that the submitted job writes to
#' @param job.name name of the submitted job
#' @param res.req named vector of requested resources. Two options are available: \code{"clock.limit"} and \code{"mem.size"}
#' @param sub.binary flag indicating if the command is to be submitted using the \code{"wrap"} option of SLURM
#' @param depend.jobs character vector containing names or ids of jobs the submitted job will depend on.
#' @param quote.cmd Flag indicating whether the submitted command should also be wrapped in quotes
#' @return A character vector containing the submission command tokens
#'
#' @rdname getSubCmdTokens-ClusterArchitectureSLURM-methods
#' @docType methods
#' @aliases getSubCmdTokens,ClusterArchitectureSLURM-method
#' @author Michael Scherer
#' @export
#' @examples
#' \donttest{
#' arch <- new("ClusterArchitectureSLURM",
#'  name="my_slurm_architecture"
#' )
#' getSubCmdTokens(arch,c("Rscript","my_great_script.R"),"my_logfile.log")
#' }
setMethod("getSubCmdTokens",
    signature(
        object="ClusterArchitectureSLURM"
    ),
    function(
      object,
      cmd.tokens,
      log,
      job.name = "",
      res.req = character(0),
      depend.jobs = character(0),
      sub.binary = TRUE,
      quote.cmd = TRUE
    ) {
      res.req.token <- NULL
        if(length(res.req)>0){
          if("clock.limit" %in% names(res.req)){
            res.req.token <- paste(res.req.token,"-t",res.req["clock.limit"]," ",collapse = "")
          }
          if("mem.size" %in% names(res.req)){
            res.req.token <- paste0(res.req.token,"--mem=",res.req["mem.size"],collapse="")

          }
        }
        log.token <- NULL
        if (nchar(log)>0) {
            log.token <- c("-o",log)
        }
        job.name.token <- NULL
        if (nchar(job.name)>0) {
            job.name.token <- paste0(job.name.token,"--job-name=",job.name,collapse = "")
        }
        dependency.token <- NULL
        if (length(depend.jobs)>0){
          get.job.id <- function(x){
            cmd <- paste0("echo $(squeue --noheader --format %i --name ",x,")")
            system(cmd,intern = T)
          }
          depend.jobs <- sapply(depend.jobs,get.job.id)
            dependency.token <- paste0(dependency.token, "--depend=", paste0(paste(depend.jobs,collapse=",")),collapse = "")
        }
        wrap.token <- NULL
        if(sub.binary){
          if (quote.cmd){
            cmd.tokens <- paste0("--wrap=",paste0("'",paste(cmd.tokens,collapse=" "),"'"),collapse="")
          }else{
            cmd.tokens <- paste0("--wrap=",paste(cmd.tokens,collapse=" "),collapse="")
          }
        }else{
          if (quote.cmd){
            cmd.tokens <- paste0("'",paste(cmd.tokens,collapse=" "),"'")
          }else{
            cmd.tokens <- paste(cmd.tokens,collapse=" ")
          }
        }

        res <- c(
            "sbatch",
            "--export=ALL",
            res.req.token,
            log.token,
            job.name.token,
            dependency.token,
            wrap.token,
            cmd.tokens
        )
        return(res)
    }
)
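The method above only recognizes the keys clock.limit and mem.size in res.req; any other name, such as mem, is silently dropped, so no --mem flag ever reaches sbatch. A minimal shell sketch of that token-building logic (a hypothetical re-enactment for illustration, not RnBeads code):

```shell
# Mimics the mem.size branch of getSubCmdTokens(): only the key
# "mem.size" produces a --mem token; any other key yields nothing.
build_mem_token() {
  key=$1; value=$2
  if [ "$key" = "mem.size" ]; then
    printf '%s' "--mem=${value}"
  fi
}

build_mem_token mem.size 32G   # prints --mem=32G
build_mem_token mem 32G        # prints nothing: the request is silently ignored
```

With c(mem.size="32G") the assembled command therefore starts with sbatch --export=ALL --mem=32G ..., which matches the behaviour observed after renaming the key.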
