Tool: Snakemake--Python-Based Workflow Management
Sean Davis (National Institutes of Health, Bethesda, MD) wrote 4.0 years ago:

From the snakemake website:

Build systems like GNU Make are frequently used to create complicated workflows, e.g. in bioinformatics. This project aims to reduce the complexity of creating workflows by providing a fast and comfortable execution environment, together with a clean and modern domain specific specification language (DSL) in python style:

rule targets:
    input:  'plots/dataset1.pdf', 
            'plots/dataset2.pdf' 
rule plot:
    input:  'raw/{dataset}.csv'
    output: 'plots/{dataset}.pdf'
    shell:  'somecommand {input} {output}'

Like with GNU Make, in Snakemake you first specify targets in terms of a pseudo rule, and then how they are created via one or more steps of subsequent rule applications. Rules can be generalized via wildcards (here {dataset}). Everything is propagated top-down, i.e. here Snakemake determines that for the file "plots/dataset1.pdf" the rule plot has to be applied with wildcard {dataset} = dataset1 to the file raw/dataset1.csv. How the files are created is specified either with a shell command or python code. Further, Snakemake can interface with R to specify R code inside rules. Also see the FAQ to get an impression of the basic idea behind Snakemake.
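The top-down wildcard resolution described above can be sketched in plain Python. This is only an illustration under stated assumptions, not Snakemake's actual implementation: the helper names `pattern_to_regex` and `resolve` are hypothetical, and each wildcard is assumed to appear only once in the output pattern.

```python
import re


def pattern_to_regex(pattern):
    """Turn 'plots/{dataset}.pdf' into a regex with one named group per wildcard."""
    # re.split with a capturing group alternates: literal, name, literal, name, ...
    parts = re.split(r"\{(\w+)\}", pattern)
    regex = ""
    for i, part in enumerate(parts):
        if i % 2 == 0:
            regex += re.escape(part)          # literal text between wildcards
        else:
            regex += "(?P<%s>.+)" % part      # wildcard becomes a named group
    return re.compile("^" + regex + "$")


def resolve(target, output_pattern, input_pattern):
    """Infer wildcard values from a requested target and derive the input file."""
    match = pattern_to_regex(output_pattern).match(target)
    if match is None:
        return None                           # this rule cannot produce the target
    wildcards = match.groupdict()
    return wildcards, input_pattern.format(**wildcards)


# Hypothetical stand-in for the 'plot' rule from the example above.
print(resolve("plots/dataset1.pdf", "plots/{dataset}.pdf", "raw/{dataset}.csv"))
```

For the target plots/dataset1.pdf this yields {dataset} = dataset1 and the input raw/dataset1.csv, matching the description in the quoted text.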


This tool is excellent. A simple introduction, with an example comparing the same pipeline written in both Perl and Snakemake, is available here: https://bitbucket.org/johanneskoester/snakemake/wiki/Getting%20Started%20with%20Snakemake%20and%20Qsub

written 4.0 years ago by Endre Bakken Stovner

Too bad it only works with Python 3. Does anyone have a solution to make it work without having to get Python 3?

written 4.0 years ago by UnivStudent

I don't have a solution to that, but a very easy way to install Python 3 locally, without needing root access, is the pyenv program.

After installing pyenv, you can install Python with e.g. (given that 3.4.1 is the version you want to install):

pyenv install 3.4.1

... and then activate that version in your local shell:

pyenv shell 3.4.1

... or globally with:

pyenv global 3.4.1
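
To confirm that the interpreter selected this way is the one actually in use, a small check like the following can help. It is a minimal sketch; `is_python3` is a hypothetical helper name and not part of pyenv or Snakemake.

```python
import sys


def is_python3(version_info=sys.version_info):
    """Return True if the given version tuple belongs to Python 3 or later."""
    return version_info[0] >= 3


# Report which interpreter is running and whether it meets the requirement.
print("Running Python %d.%d.%d" % tuple(sys.version_info[:3]))
print("Python 3 available:", is_python3())
```

Running this with the `python` found on your PATH after `pyenv shell` or `pyenv global` shows which version is active.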

modified 2.8 years ago • written 2.8 years ago by Samuel Lampa

A Snakemake solution is here: http://coderscrowd.com/app/codes/view/192

written 4.0 years ago by Jeremy Leipzig


Unfortunately, supporting Python 2 is not possible because of missing functionality in the multiprocessing module. However, setting up Python 3 in your home directory is very easy with virtualenv, or even without it, using ~/.local as a prefix.

written 4.0 years ago by johannes.koester

An example of how to do this might be worth a FAQ entry, since I think Python 3 is one of the main stumbling blocks to greater uptake.

modified 4.0 years ago • written 4.0 years ago by Sean Davis

I have added the virtualenv setup to the documentation.

written 3.7 years ago by Jeremy Leipzig

Cool, I would propose adding documentation for setup with pyenv, since it takes away a lot of the complexity of the virtualenv / virtualenvwrapper / virtualenv-burrito mess.

written 2.7 years ago by Samuel Lampa

Anaconda is an excellent and completely free Python distribution. It installs Python 3 together with mathematical tools that are sometimes hard to install, such as numpy, scipy, matplotlib and a few others.

* Installs cleanly into a single directory
* Doesn’t require root or local admin privileges
* Doesn’t affect other Python installs on your system, or interfere with OS X Frameworks

https://store.continuum.io/cshop/anaconda/

modified 4.0 years ago • written 4.0 years ago by Endre Bakken Stovner

Great! I'll have to try that out.

written 4.0 years ago by UnivStudent

What's wrong with the good old robust GNU Make? It looks like I have to type less there while getting the same functionality. Are there any major advantages of Snakemake that I am missing?

written 4.0 years ago by Christian

It is much easier to code, you can easily use Python (and all its libraries) in the workflow file alongside shell scripting, each rule can have several input and output files (even if the exact number is not known in advance), and rules are easy to generalize so that they work with many different datasets: https://bitbucket.org/johanneskoester/snakemake/wiki/Documentation#markdown-header-wildcards
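
As a rough illustration of why having Python available helps with generalizing over datasets, the target list for a workflow can be generated from a list of names instead of being typed out by hand. The sketch below is plain Python, not Snakemake's own API; `expand_pattern` and the dataset names are hypothetical.

```python
import itertools


def expand_pattern(pattern, **wildcards):
    """Fill a pattern with every combination of the given wildcard values."""
    names = sorted(wildcards)
    combos = itertools.product(*(wildcards[name] for name in names))
    return [pattern.format(**dict(zip(names, combo))) for combo in combos]


# The number of datasets does not need to be fixed in advance; the list could
# just as well come from a directory listing or a sample sheet.
datasets = ["dataset1", "dataset2"]
targets = expand_pattern("plots/{dataset}.pdf", dataset=datasets)
print(targets)
```

With the two dataset names above, this produces the same two plot targets that the original example lists explicitly.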

modified 4.0 years ago • written 4.0 years ago by Endre Bakken Stovner

Snakemake is designed to work in a cluster environment. I routinely run snakemake over 3000 cores.

written 4.0 years ago by Sean Davis