Closed: How to run simultaneous jobs in parallel using job arrays in SLURM?
6.0 years ago
Biologist ▴ 290

Dear all,

I need help running simultaneous jobs in parallel on SLURM. I'm very new to array jobs and to SLURM. I have around 100 tar.gz files. I would like to untar them, align the extracted FASTQs with hisat2 to produce BAM files, sort the BAMs, and finally write the output as sorted.bam files.

tar.gz -> fastq (after extraction) -> bam -> sorted.bam

I made the script below to run on the SLURM cluster.

#!/bin/bash

#SBATCH --cpus-per-task=1
#SBATCH --mem-per-cpu=4G
#SBATCH --time=05:59:59
#SBATCH --tmp=500G
#SBATCH --array=1-100%20

mkdir /home/destination
cd /home/destination

for i in /home/eg/*.tar.gz
do
    tar xvzf $i -C $TMPDIR
    for sample in $TMPDIR/*1.fastq
    do
        dir2="/home/destination"
        base=$(basename $sample "_1.fastq")
        base2=$(basename $i ".tar.gz")
        module load HISAT2/2.0.4-goolf-1.7.20
        module load SAMtools/1.3.1-goolf-1.7.20
        hisat2 -p 8 --dta --rna-strandness RF \
            -x /home/grch38_snp_tran/genome_snp_tran \
            -1 $TMPDIR/${base}_1.fastq -2 $TMPDIR/${base}_2.fastq \
            | samtools view -Sb - > $TMPDIR/${base2}.bam
        samtools sort -T $TMPDIR/${base2}.sorted \
            -o ${dir2}/${base2}.sorted.bam $TMPDIR/${base2}.bam
    done
done

With this, the jobs started running, but every array task repeats the same work: even after one task finishes, the same files get processed again. Do I need to use "$SLURM_ARRAY_TASK_ID"? How do I do that with the code above?
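One common pattern, sketched below, is to build the list of tarballs once and index it with $SLURM_ARRAY_TASK_ID, so each array task processes exactly one file instead of looping over all 100. This is a minimal sketch, not a tested solution: it reuses the paths, modules, and hisat2/samtools calls from the script above, while raising --cpus-per-task to 8 (to match hisat2 -p 8) and using mkdir -p (so concurrent tasks don't fail creating the same directory) as assumptions.

#!/bin/bash
#SBATCH --cpus-per-task=8           # assumption: match hisat2 -p 8
#SBATCH --mem-per-cpu=4G
#SBATCH --time=05:59:59
#SBATCH --tmp=500G
#SBATCH --array=1-100%20

module load HISAT2/2.0.4-goolf-1.7.20
module load SAMtools/1.3.1-goolf-1.7.20

# Build the file list once, then pick the entry for this task's index.
# Bash arrays are 0-based and the array indices start at 1, hence the -1.
FILES=(/home/eg/*.tar.gz)
i=${FILES[$SLURM_ARRAY_TASK_ID - 1]}

mkdir -p /home/destination          # -p: safe when tasks race to create it
tar xvzf "$i" -C "$TMPDIR"

base2=$(basename "$i" .tar.gz)
for sample in "$TMPDIR"/*_1.fastq
do
    base=$(basename "$sample" _1.fastq)
    hisat2 -p 8 --dta --rna-strandness RF \
        -x /home/grch38_snp_tran/genome_snp_tran \
        -1 "$TMPDIR/${base}_1.fastq" -2 "$TMPDIR/${base}_2.fastq" \
        | samtools view -Sb - > "$TMPDIR/${base2}.bam"
    samtools sort -T "$TMPDIR/${base2}.sorted" \
        -o "/home/destination/${base2}.sorted.bam" "$TMPDIR/${base2}.bam"
done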

Also, how do I get a separate .out file for each array task index?
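For per-task .out files, sbatch expands %A to the master job ID and %a to the array task index in the --output and --error patterns. For example (the hisat2_ filename prefix here is arbitrary):

#SBATCH --output=hisat2_%A_%a.out   # %A = job ID, %a = array task index
#SBATCH --error=hisat2_%A_%a.err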

Any help is appreciated.

slurm parallel alignment fastq bam