Question: grid engine problem with OMA
12 months ago by
andrespara0
Chile
andrespara0 wrote:

Dear all,

Using Open Grid Engine with this command line

 qsub -b y -j y -t 1-40 -cwd /usr/local/OMA/bin/OMA

I got this error

Starting database conversion and checks...
We require that job-arrays now explicitly specify the number of jobs in the array.
You should add to your submission script an environment variable "NR_PROCESSES" that holds the total number of jobs you use.
Example:
  in bash: export NR_PROCESSES=100
  in tcsh: setenv NR_PROCESSES=100
ERROR: require NR_PROCESSES to be assigned to an environment variable

I used this line

export NR_PROCESSES=100

but it keeps failing

Previous versions of OMA worked with our setup, but now the process starts, qstat shows the assigned jobs for a few seconds, and then all the processes vanish.

Using OMA 2.1.1 and grid engine GE 6.2u5, Ubuntu 14.04

EDITED TO ADD

I would also like to know whether OMA 2+ has been tested on Open Grid Engine, or whether there is feedback from other users on this setup. Would it be better to switch from Open Grid Engine to Slurm for using OMA 2+?

Thanks for your help

grid engine oma cluster
modified 12 months ago by adrian.altenhoff440 • written 12 months ago by andrespara0

See if something in this past thread helps: Failure to launch OMA in array mode on SLURM cluster

written 12 months ago by genomax59k

It didn't help; the variable is set. Running echo "NR_PROCESSES $NR_PROCESSES" shows 100.

modified 12 months ago • written 12 months ago by andrespara0

Tagging: adrian.altenhoff

written 12 months ago by genomax59k

I would also like to know whether OMA 2+ has been tested on Open Grid Engine, or whether there is feedback from other users on this setup or scenario. Would it be better to switch from Open Grid Engine to Slurm for using OMA 2+?

modified 12 months ago • written 12 months ago by andrespara0
12 months ago by
alex.wv20
United Kingdom
alex.wv20 wrote:

Hi, I'm a member of the group that develops OMA.

I think the problem here is that the environment variable (NR_PROCESSES) is being set locally; however, qsub doesn't copy the environment variables set on the submission node to the worker nodes by default.

The -V option would copy all environment variables, so that your command listed above would become

export NR_PROCESSES=100
qsub -V -b y -j y -t 1-40 -cwd /usr/local/OMA/bin/OMA

There is also a shorthand (using -v),

qsub -v NR_PROCESSES=100 -b y -j y -t 1-40 -cwd /usr/local/OMA/bin/OMA

if you only need to set NR_PROCESSES.

The command qstat -j <JOB_ID> lists the environment variables that will be copied to the worker nodes, so you can verify that NR_PROCESSES is set.
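The behaviour described above can be mimicked locally, without a cluster. This is only an analogy, assuming a POSIX shell and the standard env utility: env -i starts a child process with an empty environment, much as a job submitted without -V or -v does not see variables exported on the submission node. The helper name show_nr_processes is made up for this sketch, and the value 40 is chosen to match the -t 1-40 range, since the error message asks for the total number of jobs in the array.

```shell
#!/bin/bash
# Hypothetical helper: report what a child process sees for NR_PROCESSES.
# "$@" is the launcher prefix, e.g. "env -i" for an empty environment.
show_nr_processes() {
    "$@" /bin/sh -c 'echo "NR_PROCESSES=${NR_PROCESSES:-<unset>}"'
}

export NR_PROCESSES=40

# The child inherits the exported variable (similar in spirit to qsub -V).
show_nr_processes env

# The child starts with an empty environment, so the variable is gone
# (similar in spirit to qsub without -V or -v).
show_nr_processes env -i

# Only the named variable is passed on
# (similar in spirit to qsub -v NR_PROCESSES=40).
show_nr_processes env -i NR_PROCESSES=40
```

The second call is the situation in the original question: the variable exists in the submitting shell, but the process that actually runs OMA never receives it.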

Best wishes, Alex

written 12 months ago by alex.wv20

Thanks for the help Alex! It worked!!

written 12 months ago by andrespara0

Hi again. I wonder if I could send the jobs to only certain nodes, so that I can start multiple OMA runs with different datasets on the same grid. We have several nodes called "ubuntu-node2", "ubuntu-node3" and so on. I am currently launching OMA with this line:

qsub -v NR_PROCESSES=100 -b y -j y -t 1-40 -cwd /usr/local/OMA/bin/OMA

Is there any parameter I can add? Sorry to ask, but I have no experience with the grid or this kind of architecture. Let me know if I should make this a separate post for more visibility. Thanks for your help.

written 8 months ago by andrespara0

Hi, it seems you can specify the hostnames with -l; see, for example, the answer to a similar question here: https://stackoverflow.com/questions/19635895/run-a-job-on-all-nodes-of-sun-grid-engine-cluster-only-once
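A minimal sketch of what such a submission could look like, assuming your Grid Engine version accepts a hostname resource request with "|" between alternative hosts (please check man qsub on your installation before relying on it). The helper name oma_qsub_cmd is hypothetical; it only prints the command so that it can be inspected before it is run. It also uses one number for both -t and NR_PROCESSES, since OMA asks for the total number of jobs in the array.

```shell
#!/bin/bash
# Hypothetical helper: print a qsub command for an OMA array job that is
# restricted to the given hosts. Nothing is submitted by this function.
# Usage: oma_qsub_cmd <number_of_tasks> <host> [<host> ...]
oma_qsub_cmd() {
    local ntasks=$1
    shift
    local hosts
    # Join the host names with "|" so that any of the listed hosts may be used
    # (assumption: the scheduler accepts this expression syntax).
    hosts=$(IFS='|'; echo "$*")
    echo "qsub -v NR_PROCESSES=${ntasks} -l hostname='${hosts}' -b y -j y -t 1-${ntasks} -cwd /usr/local/OMA/bin/OMA"
}

# Example: a 40-task run limited to two of the nodes named in the question.
oma_qsub_cmd 40 ubuntu-node2 ubuntu-node3
```

A second OMA run with a different dataset could then be started from its own working directory with a different set of host names.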

written 8 months ago by adrian.altenhoff440

Thanks! I will try this

written 8 months ago by andrespara0
12 months ago by
Switzerland
adrian.altenhoff440 wrote:

Hi,

I think this problem is related to the way you submit the job and the specific configurations of the SGE system. I suggest you try to launch the job with a submission script, e.g.

cat > ./start-oma.sh << 'EOF'
#!/bin/bash
#$ -S /bin/bash
# Request ten minutes of wallclock time.
#$ -l h_rt=0:10:0
# Request 2 gigabytes of RAM (tmem may be a site-specific resource; remove it if your cluster does not define it).
#$ -l h_vmem=2G,tmem=2G
# Set up the job array, e.g. 3 tasks.
#$ -t 1-3
# Set the name of the job.
#$ -N oma
#$ -cwd

# Run the application. NR_PROCESSES must equal the number of tasks in the array (-t 1-3).
export NR_PROCESSES=3
/usr/local/OMA/bin/OMA
EOF

The resulting submission script can then be submitted to the cluster:

qsub start-oma.sh

At least this setup seems to work for me. Your above submission also results in an error for me, so I think chances are high this will work.

written 12 months ago by adrian.altenhoff440

A collaborator beat me to it and ran Alex's solution on the grid first, but I will test your script as soon as the current run ends. Thanks for your help, Adrian.

written 12 months ago by andrespara0

When I try to execute "qsub start-oma.sh" it says

qsub: Unknown option

bash start-oma.sh

works, but it starts only one process, and I didn't notice any activity in qstat; maybe that is the intended behaviour of the script.

written 12 months ago by andrespara0

You might need to give an explicit path to the start-oma script, i.e.

qsub ./start-oma.sh

and to make the script executable (chmod +x start-oma.sh). Anyway, if Alex's variant works, that's perfect.
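Regarding the observation that bash start-oma.sh starts only one process and shows nothing in qstat: lines beginning with "#$" are directives that qsub reads, but to bash they are ordinary comments, so running the script directly bypasses the scheduler. The following sketch illustrates this with a small stand-in script that echoes a line instead of calling OMA. The helper name list_sge_directives and the temporary demo script are made up for illustration; SGE_TASK_ID is the variable Grid Engine sets for each task of an array job.

```shell
#!/bin/bash
# Hypothetical helper: list the scheduler directives embedded in a script.
list_sge_directives() {
    # Lines starting with "#$" are read by qsub; bash treats them as comments.
    grep '^#\$' "$1"
}

# Write a small stand-in script to a temporary directory.
# The quoted 'EOF' prevents any expansion while the file is written.
demo_dir=$(mktemp -d)
cat > "$demo_dir/start-oma.sh" << 'EOF'
#!/bin/bash
#$ -S /bin/bash
#$ -t 1-3
#$ -cwd
export NR_PROCESSES=3
echo "task ${SGE_TASK_ID:-<no scheduler>} of $NR_PROCESSES"
EOF
chmod +x "$demo_dir/start-oma.sh"

# Show what qsub would pick up from the script.
list_sge_directives "$demo_dir/start-oma.sh"

# Run the script directly: the directives are ignored, a single process
# starts, and SGE_TASK_ID is unset because no scheduler is involved.
bash "$demo_dir/start-oma.sh"
```

Only a submission through qsub turns the "-t 1-3" directive into three separate tasks.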

modified 12 months ago • written 12 months ago by adrian.altenhoff440