If someone has already managed to run Cell Ranger with Slurm, maybe you can help me:
Until now, I was running Cell Ranger on a cluster, on a single node with 512 GB of RAM.
For a dataset of 6.7k cells at 90k reads/cell,
the cellranger count pipeline takes 7 h 30 min.
I have access to 5 nodes with 512 GB of RAM each.
I tried the Slurm template provided by 10x (even though Slurm is not officially supported); jobs are submitted with:
```bash
#!/usr/bin/env bash
#SBATCH -J __MRO_JOB_NAME__
#SBATCH -p big
#SBATCH --export=ALL
#SBATCH --nodes=1 --ntasks-per-node=__MRO_THREADS__
#SBATCH --signal=2
#SBATCH --no-requeue
### Alternatively: --ntasks=1 --cpus-per-task=__MRO_THREADS__
### Consult with your cluster administrators to find the combination that
### works best for single-node, multi-threaded applications on your system.
#SBATCH --mem=__MRO_MEM_GB__G
#SBATCH -o __MRO_STDOUT__
#SBATCH -e __MRO_STDERR__

__MRO_CMD__
```
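As far as I understand, Martian fills in the __MRO_*__ placeholders separately for each stage chunk it submits, so the rendered script for a single-threaded chunk would look something like this (job name, memory value, and paths are made up for illustration):

```bash
#!/usr/bin/env bash
#SBATCH -J count.SC_RNA_COUNTER_CS.chnk0   # hypothetical chunk name
#SBATCH -p big
#SBATCH --export=ALL
#SBATCH --nodes=1 --ntasks-per-node=1      # __MRO_THREADS__ was 1 for this chunk
#SBATCH --signal=2
#SBATCH --no-requeue
#SBATCH --mem=4G                           # __MRO_MEM_GB__ was 4 for this chunk
#SBATCH -o /path/to/pipestance/chnk0/_stdout
#SBATCH -e /path/to/pipestance/chnk0/_stderr
# __MRO_CMD__ is replaced by the actual chunk command line here
```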
When I run the same count command on the same dataset, but with the Slurm template, as follows:

```bash
cellranger count --transcriptome=refdata-cellranger-mm10-3.0.0 \
  --fastqs=./indepth_C07_MissingLibrary_1_HL5G3BBXX,./indepth_C07_MissingLibrary_1_HNNWNBBXX \
  --jobmode=./martian-cs/v3.2.3/jobmanagers/slurm.template
```
I checked the jobs submitted: Martian submits them to Slurm in batches of 64, and one job runs per node (this is how the cluster works: I can only run one job per node, because each job is allocated all 16 CPUs of the node).
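To see what each Martian job actually requests from Slurm, I list them with (standard squeue format options; %C is the number of CPUs requested, %m the memory):

```bash
squeue -u "$USER" -o "%.18i %.30j %.4C %.8m %.10M %R"
```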
So instead of keeping 1 node busy, I parallelize across 5 nodes. BUT:
it takes 12 h 42 min instead of 7 h 30 min.
I checked the running processes and the number of CPUs they used. When I use a single node, the first process, read_chunks, uses 1-4 CPUs; the second process, python (I don't know what it is doing), uses 16 CPUs, i.e. all of them.
With the work parallelized across 5 nodes, read_chunks still takes 1-4 CPUs, but python uses ONLY ONE CPU on each node instead of 16. I guess that's why it takes so long!
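One way to confirm this after the fact (standard sacct fields): if TotalCPU is roughly equal to Elapsed for a job, it effectively ran on a single core, whatever Slurm allocated to it:

```bash
sacct -u "$USER" -S "$(date -I)" \
      --format=JobID,JobName%30,AllocCPUS,TotalCPU,Elapsed,State
```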
Is it because Slurm is not officially supported?
Do you think I can modify something in the template to change that?
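In case it matters, here is what I plan to try next, based on the cluster-mode options listed in cellranger count --help (--maxjobs, --jobinterval, --mempercore); this is a sketch, not a tested fix:

```bash
# --maxjobs: never queue more jobs than I have nodes.
# --jobinterval: delay (in ms) between submissions, to go easy on slurmctld.
# --mempercore: GB of RAM per core on my nodes (512 / 16 = 32), so that
#   memory-hungry chunks request proportionally more threads.
cellranger count --transcriptome=refdata-cellranger-mm10-3.0.0 \
  --fastqs=./indepth_C07_MissingLibrary_1_HL5G3BBXX,./indepth_C07_MissingLibrary_1_HNNWNBBXX \
  --jobmode=./martian-cs/v3.2.3/jobmanagers/slurm.template \
  --maxjobs=5 --jobinterval=2000 --mempercore=32
```

I could also switch the template to the commented --ntasks=1 --cpus-per-task=__MRO_THREADS__ variant, but I'm not sure either change addresses the single-CPU python processes.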