I am in the process of cobbling together an NGS pipeline for trimming, assembling, and annotating genomes for our group, and I have very limited bioinformatics experience. I would like to be able to run several SPAdes jobs at once. Thankfully, I have access to a cluster that uses SGE. However, I have been unable to set up an array job with SPAdes: the SPAdes command line keeps trying to read the index variable SGE uses for array jobs as part of the file name, which it then can't find. I also tried concatenating my file names into a list with one entry per line, but that failed as well. Any suggestions so I can avoid writing the same command over and over with only slight changes in the input file name?
Thanks!
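One common cause of the "index variable read as part of the file name" symptom is shell parsing: in `sample_$SGE_TASK_ID_R1.fastq` the shell looks up a variable named `SGE_TASK_ID_R1` (which is empty), so braces are required. Here is a minimal sketch of an SGE array job script for SPAdes, assuming paired-end files named `sample_<N>_R1.fastq` / `sample_<N>_R2.fastq` (those names, the task range, and the output directory are placeholders for your own). The SPAdes command is echoed rather than executed so the sketch runs anywhere; remove the `echo` on a real cluster:

```shell
#!/bin/bash
#$ -S /bin/bash   # run the job script with bash
#$ -cwd           # start in the submission directory
#$ -t 1-15        # array job: tasks numbered 1..15

# SGE sets SGE_TASK_ID to this task's index (1..15).
# Fall back to 1 so the script can also be tested outside SGE.
TASK=${SGE_TASK_ID:-1}

# Braces matter: $SGE_TASK_ID_R1.fastq would be parsed as an empty
# variable named SGE_TASK_ID_R1 -- always write ${SGE_TASK_ID}.
R1="sample_${TASK}_R1.fastq"
R2="sample_${TASK}_R2.fastq"

# Build and show the per-task SPAdes command.
echo spades.py -1 "$R1" -2 "$R2" -o "assembly_${TASK}"
```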
Actually, I don't know for sure that I am NOT doing that. I am following a template for arrays suggested by our sys admin. I have tried 2 different approaches.
1.
2.
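For the one-entry-per-line variant, the usual trick is to have task N pull out line N of the list with `sed -n "Np"`. A sketch, assuming a list file `samples.txt` with one sample name per line (the demo list and names here are made up; in practice the file already exists):

```shell
#!/bin/bash
#$ -S /bin/bash
#$ -cwd
#$ -t 1-15

TASK=${SGE_TASK_ID:-1}

# Demo list so the sketch is self-contained; normally samples.txt
# already holds your real sample names, one per line.
printf 'isolateA\nisolateB\nisolateC\n' > samples.txt

# sed -n "Np" prints only line N, so task N gets the Nth sample.
SAMPLE=$(sed -n "${TASK}p" samples.txt)

# Echoed rather than executed so the sketch runs without SPAdes.
echo spades.py -1 "${SAMPLE}_R1.fastq" -2 "${SAMPLE}_R2.fastq" -o "${SAMPLE}_assembly"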
Shouldn't you need:
Also, this is the code in the script.
I am more concerned with where ${SGE_TASK_ID} gets its value from. What is the command you use to submit the job? For example, your command should look something like:
and then you can access the numbers 1 through 15 using the array variable $SGE_TASK_ID.
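Concretely, assuming a job script named run_spades.sh (a placeholder name), the submission and the per-task variable would look like this sketch. The qsub line is echoed rather than executed so it runs without a cluster, and task 7 is simulated by hand:

```shell
# -t 1-15 makes this an array job: SGE runs 15 copies of the script,
# each with SGE_TASK_ID set to a different value from 1 to 15.
echo qsub -t 1-15 run_spades.sh

# Simulate what one task sees; on the cluster SGE exports this itself.
SGE_TASK_ID=7
INPUT="sample_${SGE_TASK_ID}.fastq"
echo "task ${SGE_TASK_ID} would assemble ${INPUT}"
```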