Grid Engine problem with OMA
6.4 years ago
andrespara ▴ 30

Dear all,

I am running OMA on Open Grid Engine with this command:

 qsub -b y -j y -t 1-40 -cwd /usr/local/OMA/bin/OMA

and I get this error:

Starting database conversion and checks...
We require that job-arrays now explicitly specify the number of jobs in the array.
You should add to your submission script an environment variable "NR_PROCESSES" that holds the total number of jobs you use.
Example:
  in bash: export NR_PROCESSES=100
  in tcsh: setenv NR_PROCESSES=100
ERROR: require NR_PROCESSES to be assigned to an environment variable

I then ran this line before submitting

export NR_PROCESSES=100

but it keeps failing

Previous versions of OMA worked with our setup, but now the process starts, qstat shows the assigned jobs for a few seconds, and then all the processes vanish.

I am using OMA 2.1.1 with Grid Engine 6.2u5 on Ubuntu 14.04.

EDITED TO ADD

I would also like to know whether OMA 2+ has been tested on Open Grid Engine, or whether other users have feedback on this setup. Would it be better to switch from Open Grid Engine to Slurm for running OMA 2+?

Thanks for your help

oma grid engine cluster • 2.8k views

See if something in this past thread helps: Failure to launch OMA in array mode on SLURM cluster


It didn't help; the variable is set. Running echo "NR_PROCESSES $NR_PROCESSES" shows 100.


Tagging: adrian.altenhoff


I would also like to know whether OMA 2+ has been tested on Open Grid Engine, or whether other users have feedback on this setup or scenario. Would it be better to switch from Open Grid Engine to Slurm for running OMA 2+?

6.4 years ago
alex.wv ▴ 50

Hi, I'm a member of the group that develops OMA.

I think the problem is that the environment variable (NR_PROCESSES) is only set locally: by default, qsub doesn't copy the environment variables set on the submission node to the worker nodes.

The -V option copies all environment variables, so your command above would become

export NR_PROCESSES=100
qsub -V -b y -j y -t 1-40 -cwd /usr/local/OMA/bin/OMA

There is also the shorthand -v, which passes only the variables you name:

qsub -v NR_PROCESSES=100 -b y -j y -t 1-40 -cwd /usr/local/OMA/bin/OMA

This is enough if you only need to set NR_PROCESSES.

The command qstat -j <JOB_ID> lists the environment variables that will be copied to the worker nodes, so you can verify that NR_PROCESSES is set.
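The effect can be reproduced locally without a cluster. The sketch below is only an analogy: env -i stands in for a worker node that starts without the submission shell's environment, and the function names without_forwarding and with_forwarding are made up for this illustration.

```shell
#!/bin/sh
# Illustration only: no grid engine is involved. env -i starts the child
# with an empty environment, much like a job that was submitted without
# -V or -v and therefore never sees the submission shell's variables.

export NR_PROCESSES=100

# Analogous to plain qsub: the exported variable is not forwarded.
without_forwarding() {
    env -i /bin/sh -c 'echo "NR_PROCESSES=${NR_PROCESSES:-unset}"'
}

# Analogous to qsub -v NR_PROCESSES=...: the variable is passed explicitly.
with_forwarding() {
    env -i NR_PROCESSES="$NR_PROCESSES" \
        /bin/sh -c 'echo "NR_PROCESSES=${NR_PROCESSES:-unset}"'
}

without_forwarding
with_forwarding
```

The first call reports the variable as unset even though it was exported in the calling shell, which matches the symptom in the question.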

Best wishes, Alex


Thanks for the help Alex! It worked!!


Hi again, I wonder if I could send the jobs to only certain nodes, so that I can start multiple OMA runs with different datasets on the same grid. We have several nodes called "ubuntu-node2", "ubuntu-node3" and so on. I am currently launching OMA with this line:

qsub -v NR_PROCESSES=100 -b y -j y -t 1-40 -cwd /usr/local/OMA/bin/OMA

Is there any parameter I can add? Sorry to ask, but I have no experience with the grid or this kind of architecture. Let me know if I should make this a separate post for more visibility. Thanks for your help.


Hi, it seems you can specify the hostnames with -l; see for example the answer to a similar question here: https://stackoverflow.com/questions/19635895/run-a-job-on-all-nodes-of-sun-grid-engine-cluster-only-once
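As a rough sketch of what that could look like with the command from this thread: the helper below only prints the qsub line (a dry run) so it can be checked before submitting. The function name oma_qsub_cmd is hypothetical. The hostname resource is a standard Grid Engine attribute, but whether your installation accepts the '|' alternation between several hosts is an assumption to verify with your administrator. The sketch also sets NR_PROCESSES equal to the array size, following the error message's wording that it should hold the total number of jobs.

```shell
#!/bin/sh
# Hypothetical helper: print (not run) a qsub command that restricts an OMA
# job array to the given hosts via the hostname resource request.
# Usage: oma_qsub_cmd <number_of_tasks> <host> [<host> ...]
oma_qsub_cmd() {
    nr_tasks=$1
    shift
    # Join the remaining arguments with '|' so any listed host may be used.
    hosts=$(IFS='|'; echo "$*")
    printf "qsub -l hostname='%s' -v NR_PROCESSES=%s -b y -j y -t 1-%s -cwd /usr/local/OMA/bin/OMA\n" \
        "$hosts" "$nr_tasks" "$nr_tasks"
}

# Dry run for two of the nodes named in the question.
oma_qsub_cmd 40 ubuntu-node2 ubuntu-node3
```

If the printed line looks right, it can be pasted into the terminal. A second run for another dataset would use a different host list and working directory.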


Thanks! I will try this

6.4 years ago

Hi,

I think this problem is related to the way you submit the job and the specific configurations of the SGE system. I suggest you try to launch the job with a submission script, e.g.

cat > ./start-oma.sh << 'EOF'
#!/bin/bash
#$ -S /bin/bash
# Request ten minutes of wallclock time
#$ -l h_rt=0:10:0
# Request 2 gigabytes of RAM.
#$ -l h_vmem=2G,tmem=2G
# Set up the job array, e.g. 3 tasks
#$ -t 1-3
# Set the name of the job
#$ -N oma
#$ -cwd

# Run the application.
export NR_PROCESSES=3
/usr/local/OMA/bin/OMA
EOF

The resulting submission script can then be used for submitting to the cluster

qsub start-oma.sh

At least this setup works for me. Your original submission command also fails for me, so I think chances are high that this will work for you.
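In the script above, the value of NR_PROCESSES has to be kept in sync with the -t range by hand. A possible refinement, sketched under assumptions, is to derive it inside the job from the SGE_TASK_FIRST, SGE_TASK_LAST and SGE_TASK_STEPSIZE variables that Grid Engine sets for array tasks. The function name count_array_tasks is made up here, and whether OMA expects exactly this count for ranges with a step other than 1 has not been checked.

```shell
#!/bin/sh
# Sketch: compute the number of tasks in the current job array from the
# variables Grid Engine sets for array jobs. Falls back to 1 when they are
# missing or non-numeric (for example when run outside a job array).
count_array_tasks() {
    first=${SGE_TASK_FIRST:-1}
    last=${SGE_TASK_LAST:-1}
    step=${SGE_TASK_STEPSIZE:-1}
    # Any non-digit character means the values are not usable numbers.
    case "$first$last$step" in
        *[!0-9]*) echo 1; return ;;
    esac
    echo $(( (last - first) / step + 1 ))
}

NR_PROCESSES=$(count_array_tasks)
export NR_PROCESSES
```

With these lines in place of the fixed export, a submission with -t 1-3 would give NR_PROCESSES=3 and one with -t 1-40 would give 40, without editing the script.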


A collaborator beat me to it and ran Alex's solution on the grid first, but I will try the script from your solution as soon as the current run ends. Thanks for your help, Adrian.


When I try to execute "qsub start-oma.sh" it says

qsub: Unknown option

bash start-oma.sh

works, but it starts only one process and I didn't notice any activity in qstat; maybe that is the intended behavior of the script.


You might need to give an explicit path to the start-oma.sh script, i.e.

qsub ./start-oma.sh

and make the script executable (chmod +x start-oma.sh). But anyway, if Alex's variant works, that's perfect.
