7 months ago
bestone
Hello everyone, I have 32 different apricot genotypes and want to analyze their whole-genome sequences. I tried to write a bash script for this, but I did not succeed. I named the raw data files Genotype1.R1.fg.gz, Genotype1.R2.fg.gz, Genotype2.R1.fg.gz, Genotype2.R2.fg.gz, and so on for all 32 genotypes. Instead of analyzing them one by one, can you help me write a single script that runs the whole analysis? The tools I will use are BWA, SAMtools, Picard, and GATK.
Note: I tried this code but it didn't work
This is a SLURM jobscript. The obvious error I can see now is that the variable $sample_name_genom is not defined and that you are not using the $REFSEQ variable properly to pass the reference to all programs. Further, --filter-name$ doesn't look good, and the name should precede the expression (I think). To spot these mistakes automatically, I recommend always adding the set -eu directive as the line after the #SBATCH parameter block. There are other things that need to be optimized, but for further help, we need the exact error messages that occur, not just 'didn't work'.
You'd better learn how to use a workflow manager like nextflow or snakemake instead of using bash loops + slurm.
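To make the points from the answer above concrete (define the sample variable, pass $REFSEQ to every tool, pair --filter-name with --filter-expression, and start with set -eu), here is a minimal dry-run sketch. It only prints the commands for one sample and does not run them. The reference path, thread count, output file names, and the QD2 filter with its threshold are assumptions for illustration, not values from the original script. Picard's MarkDuplicates is called through the gatk wrapper here, which assumes GATK4.

```shell
#!/bin/bash
set -eu   # abort on failed commands and on any undefined variable

# Hypothetical defaults; replace with the real reference and resources.
REFSEQ="${REFSEQ:-reference/apricot.fa}"
THREADS="${THREADS:-4}"

# Print (not run) the per-sample commands so they can be checked first.
print_pipeline() {
    local sample="$1"
    local r1="${sample}.R1.fg.gz"
    local r2="${sample}.R2.fg.gz"
    # Read group string; the literal \t is interpreted by bwa mem.
    local rg="@RG\\tID:${sample}\\tSM:${sample}\\tPL:ILLUMINA"
    printf '%s\n' \
        "bwa mem -t ${THREADS} -R '${rg}' ${REFSEQ} ${r1} ${r2} | samtools sort -o ${sample}.sorted.bam -" \
        "gatk MarkDuplicates -I ${sample}.sorted.bam -O ${sample}.dedup.bam -M ${sample}.metrics.txt" \
        "samtools index ${sample}.dedup.bam" \
        "gatk HaplotypeCaller -R ${REFSEQ} -I ${sample}.dedup.bam -O ${sample}.vcf.gz" \
        "gatk VariantFiltration -R ${REFSEQ} -V ${sample}.vcf.gz -O ${sample}.filtered.vcf.gz --filter-name QD2 --filter-expression 'QD < 2.0'"
}

pipeline_text="$(print_pipeline Genotype1)"
printf '%s\n' "$pipeline_text"
```

Because the sample name is a function argument and $REFSEQ is set once at the top, set -eu would stop the script immediately if either were misspelled later on.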
Or use array jobs, or submit jobs in a loop. I don't see any problems in using loops. I use snakemake on slurm and still like bash loops and use them to submit actual snakemake jobs that run on slurm nodes and trigger child jobs on other slurm nodes. It doesn't have to be one or the other.
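As a rough sketch of the "submit jobs in a loop" option, the following builds one sbatch call per genotype from the file naming used in the question. The job script name per_sample.sh and the exported variable names SAMPLE, R1, and R2 are hypothetical. By default the script only prints the sbatch commands (DRY_RUN=1), so it can be inspected before anything is submitted.

```shell
#!/bin/bash
set -eu

# Assumptions: samples are named Genotype1..Genotype32, and a hypothetical
# per-sample job script (per_sample.sh) reads $SAMPLE, $R1 and $R2.
N_SAMPLES=32
JOB_SCRIPT="per_sample.sh"
DRY_RUN="${DRY_RUN:-1}"   # 1 = print the commands only, 0 = really submit

submitted=0
for i in $(seq 1 "$N_SAMPLES"); do
    sample="Genotype${i}"
    r1="${sample}.R1.fg.gz"
    r2="${sample}.R2.fg.gz"
    cmd=(sbatch --job-name="wgs_${sample}"
         --export="ALL,SAMPLE=${sample},R1=${r1},R2=${r2}"
         "$JOB_SCRIPT")
    if [ "$DRY_RUN" = "1" ]; then
        echo "${cmd[@]}"
    else
        "${cmd[@]}"
    fi
    submitted=$((submitted + 1))
done
echo "prepared ${submitted} submissions"
```

The array-job variant would instead be a single submission with sbatch --array=1-32, with the job script deriving the sample name from $SLURM_ARRAY_TASK_ID.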
IMO OP is starting just now and will evolve to a place where they will need workflow management tools. They should be made aware that tools exist and they can use them along with other tools they're familiar with instead of asking them to replace what they know with something entirely new.
I ran it, but it says:
That looks like a problem with your cluster setup, and it seems you cannot run any job at all. Try to make a simple job script first and debug that until it is running properly. I noticed there may be a few parameters required for a SLURM job that could be missing from your script (partition to use, account name, time). Find some local support to figure this out. We cannot do this from here because every cluster configuration is slightly different. Also, some variables can be set via the environment, and I believe the error is coming from some setting you haven't shown.
What happens for example if you do:
srun -N 1 -t 0:01:00 bash -c "uname -a"
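A "simple job script" for this kind of check could look like the sketch below. The job name is arbitrary, and the partition and account lines are placeholders that are deliberately disabled with a double ##, since the correct values depend on the local cluster. The script does nothing except report where it ran, so any failure points at the SLURM setup rather than at the analysis.

```shell
#!/bin/bash
#SBATCH --job-name=smoke_test
#SBATCH --nodes=1
#SBATCH --time=00:01:00
# Placeholders: fill in and change "##SBATCH" to "#SBATCH" if your cluster
# requires these options.
##SBATCH --partition=<your_partition>
##SBATCH --account=<your_account>
set -eu

# SLURM_JOB_ID is set by SLURM inside a job; fall back to a marker otherwise.
job_id="${SLURM_JOB_ID:-not-under-slurm}"
node_info="$(uname -a)"

echo "job id: ${job_id}"
echo "node:   ${node_info}"
```

If this script runs when submitted with sbatch, the remaining problems are in the analysis script itself. If it does not, the error message is the thing to show to local support.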
Thank you for replying. I changed it but still couldn't figure it out. I have added the latest code that I ran.
That doesn't answer my question. I first wanted to make sure that you can run any job at all. Also, please remember to always post the error message. I am unable to read your mind (un-)fortunately.
You wrote: "I noticed there may be a few parameters required for a SLURM job that could be missing from your script (partition to use, account name, time)." But if you look at my bash script, you will see that they are not missing. Also, if you read the messages I sent more carefully, you would see that I posted the error code at the top. Thank you for replying.
What "error code at the top" are you talking about? Error code in response to the
srun -N 1 -t 0:01:00 bash -c "uname -a"
command? Also, Michael clearly asks you to find local support for that (possible) problem because no one outside your cluster admin/users can help you with things specific to your cluster. You seem to be addressing that problem and not the one that he said he can help you with.