sbatch Slurm job
8 weeks ago
marco.barr ▴ 80

Hello everyone, I'm having trouble launching this script with sbatch on a Linux Slurm cluster. After I run sbatch followed by the script path and then check with squeue, I don't see any job ID, and no .err file appears in the specified folder. I've double-checked the paths multiple times and they are correct. I don't understand why, since this has always worked before. I hope you can help me. Thank you.

#!/bin/bash 
#SBATCH --job-name=trimming
#SBATCH --mem=64GB  # amount of RAM required (and max RAM available).
##SBATCH --mem-per-cpu=5000  # amount of RAM per core (see --ntasks)
#SBATCH --time=INFINITE  ## OR #SBATCH --time=10:00 means 10 minutes OR --time=01:00:00 means 1 hour
#SBATCH --ntasks=10  # number of required cores
#SBATCH --nodes=1  # not really useful for non-MPI jobs
##SBATCH --partition=work  ## work is the default and only queue; you do not need to specify it.
#SBATCH --error="/home/barresi.m/RNAseq/RNAseq11/RNAseq_ERR/trimming.err"
#SBATCH --output="/home/barresi.m/RNAseq/RNAseq11/RNAseq_OUT/trimming.out"

source /opt/common/tools/besta/miniconda3/bin/activate
conda activate aligners

for i in $(cat /home/barresi.m/RNAseq/RNAseq11/patients_list1.txt) 
do trimmomatic PE -threads 6 -phred33 \
    /home/barresi.m/RNAseq/RNAseq11/Fastq/$i\_R1.fastq.gz \
    /home/barresi.m/RNAseq/RNAseq11/Fastq/$i\_R2.fastq.gz \
    /home/barresi.m/RNAseq/RNAseq11/Trimming/$i\_R1_paired.fq.gz \
    /home/barresi.m/RNAseq/RNAseq11/Trimming/$i\_R1_unpaired.fq.gz \
    /home/barresi.m/RNAseq/RNAseq11/Trimming/$i\_R2_paired.fq.gz \
    /home/barresi.m/RNAseq/RNAseq11/Trimming/$i\_R2_unpaired.fq.gz \
    ILLUMINACLIP:/datasets/adapters/trimmomatic/NexteraPE-PE.fa:2:30:10 \
    TRAILING:20 \
    MINLEN:30; \
done
cluster sbatch slurm

This would be a question better addressed to your local HPC team, as none of us are likely to know the particulars of your specific cluster.

I see in your script that you've commented out the --partition flag. At least on my cluster, that would lead to job submission failures, but I can't say if that's the case for your system.
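
If it helps, a quick way to see which partitions your cluster actually exposes (assuming you have shell access on a login node) is:

sinfo -s                # one-line summary per partition: availability, time limit, node counts
sinfo -o "%P %a %l"     # just partition name, availability and time limit

The default partition is marked with a trailing *. If none is marked, you can set it explicitly in the header, e.g. by uncommenting your #SBATCH --partition=work line.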


Also, there's an extra \ right before the done - not sure if that could be interfering with the script, but these things often lead to invisible failures.
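
One quick sanity check (the script name below is just a placeholder) is to let bash parse the submission script without executing anything:

bash -n your_trimming_script.sh   # parse only; prints syntax errors, runs nothing

Note that sbatch does not run the script body at submission time, so a shell syntax error would normally only show up in the .err file once the job actually starts.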


My guess is that the double hash in line four messes things up. SLURM is picky with these header lines. Try removing it.

8 weeks ago
marco.barr ▴ 80

Thanks everyone. I contacted the cluster administrator, and the problem turned out to be with port communication on the cluster, so it was not actually related to the script as I had thought. I ran the script both as originally written and with your suggested changes, and it works.

8 weeks ago
Michael 54k

You should always get some output from sbatch. Either:

> sbatch testscript.sh
Submitted batch job 61234567

or an error message, e.g.:

sbatch badjob.sh
sbatch: error: Batch job submission failed: No partition specified or system default partition

If you don't see any output in the output/error files, it may be because the directory does not exist; Slurm does not create it for you. If you don't see the job ID in squeue, the job may already have terminated.
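
For example (the directory paths below are the ones from the script above):

mkdir -p /home/barresi.m/RNAseq/RNAseq11/RNAseq_ERR /home/barresi.m/RNAseq/RNAseq11/RNAseq_OUT   # create the output dirs before submitting
sacct -u $USER --starttime today   # unlike squeue, this also lists jobs that already finished or failed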

To get more robust output, define output and error files like so:

#SBATCH --output=job.o%j # Name of stdout output file
#SBATCH --error=job.e%j  # Name of stderr error file

Then they will be created in the working directory and carry the job ID as a suffix. While debugging, it is often easier to use srun directly:

 srun -N 1 -p mypartition -n 1 -t 0:10:00  ./testscript.sh
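
Depending on how your cluster is configured, you can also request a short interactive shell on a compute node and run the commands by hand, which makes errors visible immediately (partition name as above):

 srun -N 1 -n 1 -p mypartition -t 0:10:00 --pty bash -i

Once the job has run (or died), sacct -j <jobid> shows its final state and exit code.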