Choosing the right sbatch parameters
2.4 years ago
drabiza1 ▴ 20

First time using SLURM and sbatch on a computer cluster. I'm trying to use it to process 100 RNA-seq samples. From what I understand, the best way to go about it is to submit 100 separate sbatch jobs. What I'm not sure about is how to select the correct parameters for an sbatch run: --nodes, --ntasks, and --ntasks-per-node.

The pipeline I'm running is essentially Hisat2-->Samtools-->StringTie2
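For reference, a single sample's run of that pipeline looks roughly like the job-script fragment below (a sketch only; the index, annotation, directory, and file names are all hypothetical placeholders):

```shell
#!/bin/bash
# Per-sample pipeline sketch (all paths and file names are hypothetical).
THREADS=8
SAMPLE=$1

# Align with HISAT2, pipe straight into samtools sort to avoid a large SAM on disk
hisat2 -p "$THREADS" -x genome_index -U "fastq/${SAMPLE}.fastq.gz" \
  | samtools sort -@ "$THREADS" -o "scratch/${SAMPLE}.bam" -
samtools index "scratch/${SAMPLE}.bam"

# Assemble/quantify transcripts with StringTie
stringtie "scratch/${SAMPLE}.bam" -p "$THREADS" -G annotation.gtf \
  -o "results/${SAMPLE}.gtf"
```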

When I used AWS, I ran everything with 64 threads. I'm not sure what that translates to in terms of nodes and ntasks.

The cluster I'm using is affiliated with my university, so I don't want to request more than I need, but I'm trying to get this work done as soon as possible and I have a reasonable budget for this project.

Any advice is appreciated. Thank you

Tags: sbatch, SLURM, ntasks, nodes
2.4 years ago
Mensur Dlakic ★ 27k

In simple terms, individual computers in a cluster are nodes. They usually share a disk system, and a head node controls how their jobs are assigned. Most of the time you don't get to choose which exact node will run your job, though you can choose which group of nodes will run your jobs if you are a member of multiple groups. Another way of "choosing" your node is to specify a job configuration that can run only on certain nodes, but that way you are limiting your resources.

Continuing with the same logic, threads of individual computers are tasks. If you ask for a single node and 64 tasks (basically you want your job to run with 64 threads; --nodes=1 --ntasks=64), and none of your nodes have more than 40 threads, your job will never run because you have specified a configuration that can't be executed. If you specify --nodes=2 --ntasks=64, your job will run on one node until all of its tasks are occupied (say, 40), and the remaining tasks (say, 24) will run on a second node. If you specify --nodes=2 --ntasks=64 --ntasks-per-node=32, your job will be evenly split so that each of the two nodes starts 32 threads.
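As a concrete illustration, the three configurations above would look like this in a job script's header (assuming hypothetical 40-thread nodes; in a real script you would use only one of these sets of directives):

```shell
# Will never run on a cluster whose nodes have at most 40 threads:
#SBATCH --nodes=1
#SBATCH --ntasks=64

# Schedulable: fills one 40-thread node, spills the remaining 24 tasks onto a second:
#SBATCH --nodes=2
#SBATCH --ntasks=64

# Schedulable: split evenly, 32 tasks on each of two nodes:
#SBATCH --nodes=2
#SBATCH --ntasks=64
#SBATCH --ntasks-per-node=32
```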

I am almost certain you will get different suggestions from what I will propose below, and you should weigh all of them in consultation with your HPC administrator.

I never run jobs over multiple nodes unless it is absolutely necessary - and even then I don't do it. Scheduling jobs over multiple nodes will typically make your jobs wait longer, because the job is more complex and not as easy to schedule. I am talking here about a reasonably busy cluster - it will make no difference if a small fraction of nodes are running jobs. I run jobs on a single node and ntasks that is never greater than the number of threads per node. That means your job runs on 40 instead of your desired 64 threads which will make it slower, but it will increase the likelihood that your job will be scheduled immediately instead of waiting for multiple nodes with a specified number of tasks to become free. This also means that you can start a larger number of simultaneous jobs.

To sum up: I suggest you find out how many nodes are in your cluster, and the number of threads per node. Then submit your jobs like so:

#SBATCH --nodes=1
#SBATCH --ntasks=N

where N <= threads per node. In my experience this is the fastest way to get multiple jobs done. The only time I have benefited from using --nodes > 1 was for single jobs that can run on a larger number of threads than what is available on individual nodes. Even in a case like that I sometimes waited 1-2 days for multiple nodes to come completely free for my job. Waiting a long time may negate the gain from having a job run on a large number of threads, and in my experience it is not worth it unless the job is truly long.
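Following that suggestion, submitting the 100 samples as independent single-node jobs could be scripted with a simple loop. This is a dry-run sketch: the sample naming scheme, the run_pipeline.sh script, and the ntasks value are hypothetical, and it prints the sbatch commands rather than submitting them (drop the leading `echo` on the cluster to actually submit):

```shell
#!/bin/bash
# Dry run: print one sbatch submission per sample (sample_001 .. sample_100).
# On the cluster, remove 'echo' to actually submit the jobs.
submitted=0
for i in $(seq -w 1 100); do
  echo sbatch --nodes=1 --ntasks=24 run_pipeline.sh "sample_${i}"
  submitted=$((submitted + 1))
done
echo "prepared ${submitted} job submissions"
```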


Thank you for this explanation, it was very helpful. I seem to be having an issue with running out of memory. To avoid running out of disk space, I've been generating all the SAM/BAM files in the scratch folder, which I think has worked. But I still seem to be encountering another error during alignment, and I believe it is because I am exceeding RAM when several samples run at the same time. Here is the report I see:

    State: COMPLETED (exit code 0)
    Nodes: 1
    Cores per node: 24
    CPU Utilized: 06:36:25
    CPU Efficiency: 31.31% of 21:06:00 core-walltime
    Job Wall-clock time: 00:52:45
    Memory Utilized: 19.12 GB
    Memory Efficiency: 16.60% of 115.22 GB

My memory efficiency is always around ~15%, so I'm not sure why I'm running into this issue. I've been using --nodes=1 --ntasks=24 --partition=shared for the 100 samples, each as an individual sbatch job. Do you think the shared partition has something to do with it?

The reasons I believe the issue is RAM are the HISAT2 error I receive, which I've seen before when RAM was a problem, and the fact that there is no problem when I don't run the samples concurrently.

Do you have any suggestions?

2.4 years ago
GenoMax 141k

The answer to this is going to depend somewhat on the hardware you have access to and how your SLURM instance is set up.

If you are planning to run parallel jobs (there are fancier options such as job arrays, but I prefer the simple one), then selecting 8 cores per job along with an adequate amount of RAM (20-30 GB; I don't use HISAT2, but that should be in the ballpark for a human genome) would be a good start. Be sure to ask for adequate time (the default limit may be 2 h or less). Princeton has a nice help page that describes SLURM options for their hardware; you should be able to extrapolate from that to suit local needs. Even if you submit all 100 jobs, not all of them will start running. On a well-managed cluster your account will have access to a certain amount of resources (CPU, memory, etc.), so a subset of the submitted jobs will start running until you use up your allocation. The rest will then PEND and start as the original jobs finish.
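Put together, a per-sample job script header along those lines might look like this (a sketch; the partition name, time limit, and memory request are assumptions that you would adjust for your own cluster):

```shell
#!/bin/bash
#SBATCH --nodes=1
#SBATCH --ntasks=8            # 8 cores per job, as suggested above
#SBATCH --mem=30G             # ballpark RAM for a human genome
#SBATCH --time=04:00:00       # request more than the (possibly short) default limit
#SBATCH --partition=shared    # hypothetical partition name
```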

