Question: Failure to launch OMA in array mode on SLURM cluster
jeremias.br wrote (14 months ago):

I am working on a CentOS 7.3 cluster running Slurm, with OMA 2.1.1. Unfortunately, I am not able to run OMA in array mode. I am using the files included in the ToyExample directory.

Here is the SLURM script I have run:

#!/bin/bash
#SBATCH --time=6:00:00
#SBATCH --job-name="toy_test"
export NR_PROCESSES=2
$HOME/soft/OMA.2.1.1/bin/oma

I launched the script with this command, as per the instructions on the OMA site:

sbatch --array=1-2 -N1 toy_launch.sh
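A minimal sketch of keeping the array size and NR_PROCESSES in one place, assuming (as the script and command above suggest) that OMA expects the two numbers to match. The helper name build_submit_cmd is hypothetical; `--export=ALL,NAME=value` is standard sbatch syntax for passing the submitting environment plus one extra variable. The helper only prints the command, so it can be inspected before submitting. Whether passing the variable this way changes the behaviour described in this thread is not established here.

```shell
#!/bin/bash
# Hypothetical helper: print an sbatch command in which the array range
# and NR_PROCESSES are both derived from the same number.
build_submit_cmd() {
    local nproc="$1" script="$2"
    printf 'sbatch --array=1-%s -N1 --export=ALL,NR_PROCESSES=%s %s\n' \
        "$nproc" "$nproc" "$script"
}

# Print the command for two processes and the launch script from the post.
build_submit_cmd 2 toy_launch.sh
```

To submit instead of printing, the output can be pasted into the shell once it looks right.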

The jobs fail with this error:

Starting database conversion and checks...

ERROR: failed to parse anything

When I remove the "export NR_PROCESSES=2" line from the script, OMA does launch, but it assumes it was not launched as an array job:

Starting database conversion and checks...

WARNING: not run as a job-array. Will assume it is a single process

This Biostar issue seems relevant but shows a different error message. I tried to apply the fix presented there, but the lib/Platforms file seems to have changed substantially since then.

adrian.altenhoff (Switzerland) wrote (13 months ago):

Hi Jeremias,

I'm one of the OMA developers. This problem is quite strange to me. It is likely caused by a Slurm configuration that differs from the ones we have tested so far. I would like to understand the problem better in order to improve Slurm support in OMA standalone. There are two things I suggest you try:

  1. Try to set NR_PROCESSES outside your launch script: export NR_PROCESSES in the shell before you run the sbatch command.
  2. If it still fails, could you check what the following environment variables are set to inside the job? To do that, replace the call to oma in your script file with the following lines:

    echo "NR_PROCESSES $NR_PROCESSES"
    echo "SLURM_ARRAY_JOB_ID $SLURM_ARRAY_JOB_ID"
    echo "SLURM_ARRAY_TASK_ID $SLURM_ARRAY_TASK_ID"
    echo "SLURM_ARRAY_TASK_MAX $SLURM_ARRAY_TASK_MAX"
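The check in step 2 could be sketched as a small standalone job script. This is only an illustration: the job name, the short time limit, and the function name report_array_env are assumptions, not part of OMA. Unlike a plain echo, it marks unset or empty variables explicitly, which makes a missing NR_PROCESSES easy to spot in the job output.

```shell
#!/bin/bash
#SBATCH --time=0:05:00
#SBATCH --job-name="toy_env_check"

# Print each variable of interest as NAME=value,
# showing <unset> when the variable is unset or empty.
report_array_env() {
    local name
    for name in NR_PROCESSES SLURM_ARRAY_JOB_ID SLURM_ARRAY_TASK_ID SLURM_ARRAY_TASK_MAX; do
        # ${!name} is bash indirect expansion: the value of the variable named by $name.
        printf '%s=%s\n' "$name" "${!name:-<unset>}"
    done
}

report_array_env
```

Submitted the same way as the real job (for example with --array=1-2), each array task writes its own four lines to its Slurm output file.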

Thanks for reporting this.

Best wishes, Adrian

jeremias.br wrote (13 months ago):

Hi Adrian,

Thanks for your reply.

1. Exporting NR_PROCESSES in the shell used to submit the job did not work. OMA still thinks it is running as a single process.

2. I did two runs: one with the "export NR_PROCESSES=2" command in the script and one without.

With the export command:

Starting database conversion and checks...

ERROR: failed to parse anything

NR_PROCESSES 2

SLURM_ARRAY_JOB_ID 13799019

SLURM_ARRAY_TASK_ID 1

SLURM_ARRAY_TASK_MAX 2

Without the export command, it again runs as a single process:

Starting database conversion and checks...

WARNING: not run as a job-array. Will assume it is a single process

[...]

Done!!

NR_PROCESSES

SLURM_ARRAY_JOB_ID 13799064

SLURM_ARRAY_TASK_ID 1

SLURM_ARRAY_TASK_MAX 2

Comment by adrian.altenhoff (13 months ago):

So strange! Which version of Slurm are you using? The cluster I have access to, where the ToyExample works like a charm with your submission script, is a Red Hat Enterprise Linux 6.7 installation with Slurm 14.11.10. (You can get the Slurm version with slurmctld -V.)

Comment by jeremias.br (13 months ago):

I'm using Slurm 17.02.3 on CentOS 7.3.1611.