Dear all,
I apologise in advance if this issue has been addressed before and I just did not find it in here. Also, I am not a bioinformatician/programmer/computer person at all...
I have been trying to analyse my data set from a CUT&RUN experiment using the CUT&RUNTools pipeline (following this: https://bitbucket.org/qzhudfci/cutruntools/src/master/USAGE.md). Everything was OK (after a lot of work): I was able to validate my JSON file and to create a set of SLURM job-submission scripts based on that configuration file. However, when I set out to start the actual analysis using sbatch I get a "command not found" message and I am stuck... I do not know what to do (I have asked around; no one was able to help). I am using the terminal on a macOS laptop.
I put this into my command line: sbatch path-to-this-sh-file/integrated.sh path-to-FASTA-seq-file/FILE.fastq.zip and I get this in return: zsh: command not found: sbatch
Any advice/tip/previous experience with this is more than welcome.
Thank you!
Kind regards,
Nadia
sbatch is a SLURM command meant to be used on an HPC cluster, not on a macOS machine.

Hi Ram,
Thank you for your answer. I know, but I asked one of the people from the CUT&RUN lab about this same issue and they just suggested installing SLURM (which is already there).
Is there any way to modify this to use it in a non-cluster setting?
Thank you.
Are you sure SLURM is working on your Mac?
You can find the actual scripts on this page. You will need to find the right sequence by looking at the content of integrated.sh (the main script) and then run the steps manually.

Hi GenoMax,
No, I am not. I have installed SLURM and tried some things with someone who knows more about this, but had no luck. I have read that SLURM no longer runs on macOS, although I am not sure how true that is.
Thank you for the "main script" hint... I have been staring at that Bitbucket page for days (weeks, really) trying to do it manually (I want to understand what I am doing), but, as you may have noticed, I am not an experienced programmer/computer person.
Thank you!
No worries, the main thing is that you are making an effort. So, setting SLURM aside, these are the steps you will need to run.
Some of the things below have hard-coded paths (e.g. /home/qz64/kseq_test) internal to the system the author is using, so you will need to change those locations to your own. They define a lot of system folder paths as variables at the top of the script. So, for example, if you want to work in a directory called /path_to/my_data, that would be set as the base variable by doing base=/path_to/my_data. The script then refers to this location by using the name $base.

Once these steps complete you will need to move on to the next one in the series. It looks like a total of 5 scripts.
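To illustrate, here is a minimal sketch of what running a step by hand could look like without SLURM (the directory, FASTQ name and script location are placeholders, not the actual values from the pipeline):

    base=/path_to/my_data                 # your working directory, as described above
    fastq="$base"/FILE_R1.fastq.gz        # placeholder FASTQ file inside it

    # without a scheduler, run the script directly with bash instead of sbatch
    bash /path_to/cutruntools/integrated.sh "$fastq"

When run with bash, the #SBATCH lines at the top of the script are simply ignored as comments, but any cluster-specific commands inside it (for example module load lines) may still need to be removed or adapted for your laptop.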
Thank you :)
Yes, the entire pipeline is based on internal paths. I was extremely happy when I stopped getting errors from the validate.py step and managed to get the config.json file adapted to my data and folders, which is why it was so frustrating not being able to run the analysis because of the sbatch issue :(
I first did the analysis manually, step by step: I used Trimmomatic, aligned with Bowtie2 (with --dovetail and without it; it worked better with it) and got stuck at peak calling with MACS2 because the reads were different lengths (I have paired-end data)... that is when I decided to give the CUT&RUNTools pipeline a try. Apart from the kseq step, whose commands and exact purpose I did not know, so I could not use it, I did do those steps (your last comment actually brings a little validation to my earlier attempt, and a bit of relief). I think I will go back to those files and work on them following your suggestions.
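For context, a rough sketch of those two steps with placeholder index, file and sample names (other options omitted; --dovetail is the Bowtie2 flag mentioned above, and for paired-end data MACS2 can be given the BAM in BAMPE format so it works from fragments rather than individual read lengths):

    # alignment with Bowtie2 (placeholder index and FASTQ names)
    bowtie2 --dovetail -x genome_index \
        -1 sample_R1.fastq.gz -2 sample_R2.fastq.gz -S sample.sam

    # sort and convert to BAM (assumes samtools is installed)
    samtools sort -o sample.sorted.bam sample.sam

    # peak calling with MACS2, declaring the input as paired-end
    macs2 callpeak -t sample.sorted.bam -f BAMPE -g hs -n sample --outdir macs2_out

This is only a sketch of the general shape of the commands, not the exact invocations used by CUT&RUNTools.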
Thank you!!