Forum: Snakemake vs. Nextflow: strengths and weaknesses
ropolocan (Canada) wrote, 4 months ago:

I have seen increasing interest in workflow/pipeline management systems such as snakemake and nextflow. In my opinion, both seem very interesting and very promising. There is a very interesting review from 2016 in which bash, make, snakemake and nextflow were compared: https://www.jmazz.me/blog/NGS-Workflows

The author of that review did a very good job of analyzing the strengths and weaknesses of snakemake and nextflow. I am not sure how much has changed since then, but in your experience, what criteria could bioinformaticians use to choose one over the other? Have some of the identified weaknesses of snakemake and nextflow been addressed since then?

Tags: nextflow, snakemake, forum

And is BioMake out of the game? It uses Prolog (which is both its weakness and its strength...).

Reply written 4 months ago by kamiljaron

Hello, @kamiljaron. I was not aware of BioMake; I will have to read up on it. I do not know Prolog, nor have I ever used a logic programming language, but I will read more about what BioMake has to offer.

Reply written 4 months ago by ropolocan

I started using snakemake 6 months ago, and I have now moved all my pipelines to snakemake (ChIP-seq, RNA-seq, ATAC-seq and DNA-seq). I am pretty happy with it. Once you get the idea of how snakemake works (think in a bottom-up fashion, from the final output back to the inputs), it is easy to build your own pipelines. BTW, the documentation is awesome.
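
To make the "bottom-up" idea concrete, here is a minimal sketch in plain Python (not Snakemake syntax, and not how Snakemake is actually implemented). The rule names, file patterns, and the `match_rule` and `plan` helpers are all hypothetical. You ask for a final file, the output pattern that matches it fixes the wildcard, and the inputs are resolved backwards from there.

```python
import re

# Hypothetical toy rules: each maps an output pattern to its input patterns.
RULES = {
    "sort": {"output": "sorted/{sample}.bam", "input": ["mapped/{sample}.bam"]},
    "map": {"output": "mapped/{sample}.bam", "input": ["reads/{sample}.fastq"]},
}


def match_rule(target):
    """Return (rule_name, wildcards) for the first rule whose output matches target."""
    for name, rule in RULES.items():
        # Split "sorted/{sample}.bam" into literal text and wildcard names,
        # then build a regex with one named group per wildcard.
        parts = re.split(r"\{(\w+)\}", rule["output"])
        regex = "".join(
            re.escape(part) if i % 2 == 0 else f"(?P<{part}>[^/]+)"
            for i, part in enumerate(parts)
        )
        m = re.fullmatch(regex, target)
        if m:
            return name, m.groupdict()
    return None, {}


def plan(target, existing):
    """Walk backwards from target and return the jobs needed, in execution order."""
    if target in existing:
        return []  # file is already there, nothing to do
    name, wildcards = match_rule(target)
    if name is None:
        raise ValueError(f"no rule produces {target}")
    jobs = []
    for pattern in RULES[name]["input"]:
        # Fill the wildcards into each input pattern and resolve it first.
        jobs += plan(pattern.format(**wildcards), existing)
    jobs.append((name, target))
    return jobs
```

Calling `plan("sorted/A.bam", {"reads/A.fastq"})` returns the `map` job for `mapped/A.bam` followed by the `sort` job for `sorted/A.bam`: the request starts at the bottom (the final file) and works its way up to the raw reads.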

You can write a custom script for submitting jobs to the cluster for each platform (LSF, Moab...) if you want more control over your jobs, e.g. https://bitbucket.org/snakemake/snakemake/issues/28/clustering-jobs-with-snakemake

The only downside for me is that when I have more than 1000 jobs to submit, it takes time for snakemake to process the metadata associated with each job. For a dry-run, it takes minutes. I do not know how fast nextflow is.

Reply written 4 months ago by tangming2005

dariober (Glasgow, UK) wrote, 4 months ago:

Besides what a tool can or cannot do, I like to check the quality of the documentation, whether it is actively developed and maintained, how many developers contribute to it, and the size of the user base.

It seems to me that snakemake and nextflow are pretty much on a draw for all these metrics and both are pretty good (although in terms of user base and developers they are far from tools like luigi). So I think it's a difficult choice between these two...

I haven't tried nextflow, but recently I started working with snakemake and I'm very happy with it. Actually I feel dumb that for years I've been hacking together bash scripts to run pipelines. For me one advantage of snakemake is that a snakemake script is effectively python with additional features on top. So if you know python, putting some complex logic and functions in a snakemake script is straightforward. I guess the same applies to nextflow but using groovy, which is not so popular though.

From the review you linked, it seems nextflow doesn't have a "dry run" option. I find a dry run super useful for seeing what would be executed, and it is great for developing and debugging.

Just my 2p...


About the dry run option: if I am not wrong, nextflow does not have it because it does not know a priori what the exact execution DAG will be. The Nextflow language is more expressive, and the execution DAG may depend on the input data, for example if you have conditional execution in your workflow (which is not possible in Snakemake, I think?)
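
A toy illustration of that point in plain Python (a sketch only; the step names, the `read_counts` dictionary, and the `min_reads` threshold are made up and do not reflect how either tool works internally). With a fixed pipeline every job can be listed before anything runs, which is what a dry run does. With a data-dependent branch, the list of jobs is only known once an earlier step has produced its result.

```python
def static_plan(samples):
    """A fixed pipeline: every job can be listed up front, as in a dry run."""
    return [f"{step}:{s}" for s in samples for step in ("align", "call")]


def dynamic_run(samples, read_counts, min_reads=1000):
    """A data-dependent pipeline: whether 'call' runs is only known after 'count'."""
    executed = []
    for s in samples:
        executed.append(f"count:{s}")
        # read_counts[s] stands in for a value that the 'count' job would produce.
        n = read_counts[s]
        if n >= min_reads:
            executed.append(f"call:{s}")
        else:
            executed.append(f"skip:{s}")
    return executed
```

`static_plan(["A", "B"])` can be printed without touching any data, whereas the result of `dynamic_run` changes with the counts it is given, so there is no complete job list to show in advance.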

Reply written 4 months ago by Fred

Thank you very much for your answer, @dariober.

> It seems to me that snakemake and nextflow are pretty much on a draw for all these metrics and both are pretty good (although in terms of user base and developers they are far from tools like luigi). So I think it's a difficult choice between these two...

I am curious about luigi. I have read many good comments about it, and I will be looking into testing it as well. I was testing Snakemake and I can see why it has garnered attention.

> Actually I feel dumb that for years I've been hacking together bash scripts to run pipelines. For me one advantage of snakemake is that a snakemake script is effectively python with additional features on top. So if you know python, putting some complex logic and functions in a snakemake script is straightforward. I guess the same applies to nextflow but using groovy, which is not so popular though.

Using snakemake was kind of a "eureka" moment for me as well. It has so much potential, and I look forward to adapting other pipelines I had written in bash or python to snakemake.

Reply written 4 months ago by ropolocan

Sinji (UT Southwestern Medical Center) wrote, 4 months ago:

I'm a big fan of Nextflow. I've used Snakemake in the past, and it was originally my go-to workflow language, but the built-in support for Docker, Singularity, and HPC environments that Nextflow provides just can't be beat.

The only downside is you have to use Groovy.


Thank you very much for your answer, @Sinji. I also look forward to testing Nextflow. Both workflow systems/languages have so much potential. I think they could have a very important impact on bioinformatics.

Reply written 4 months ago by ropolocan

shenwei356 (China) wrote, 4 months ago:

Table 1: Comparison of Nextflow with other workflow management systems

| Workflow | Nextflow | Galaxy | Toil | Snakemake | Bpipe |
|---|---|---|---|---|---|
| Platform | Groovy/JVM | Python | Python | Python | Groovy/JVM |
| Native task support | Yes (any) | No | No | Yes (BASH only) | Yes (BASH only) |
| Common workflow language | No | Yes | Yes | No | No |
| Streaming processing | Yes | No | No | No | No |
| Dynamic branch evaluation | Yes | ? | Yes | Yes | Undocumented |
| Code sharing integration | Yes | No | No | No | No |
| Workflow modules | No | Yes | Yes | Yes | Yes |
| Workflow versioning | Yes | Yes | No | No | No |
| Automatic error failover | Yes | No | Yes | No | No |
| Graphical user interface | No | Yes | No | No | No |
| DAG rendering | Yes | Yes | Yes | Yes | Yes |
| Container management | | | | | |
| Docker support | Yes | Yes | Yes | No | No |
| Singularity support | Yes | No | No | No | No |
| Multi-scale containers | Yes | Yes | Yes | No | No |
| Built-in batch schedulers | | | | | |
| Univa Grid Engine | Yes | Yes | Yes | Partial | Yes |
| PBS/Torque | Yes | Yes | No | Partial | Yes |
| LSF | Yes | Yes | No | Partial | Yes |
| SLURM | Yes | Yes | Yes | Partial | No |
| HTCondor | Yes | Yes | No | Partial | No |
| Built-in distributed cluster | | | | | |
| Apache Ignite | Yes | No | No | No | No |
| Apache Spark | No | No | Yes | No | No |
| Kubernetes | Yes | No | No | No | No |
| Apache Mesos | No | No | Yes | No | No |
| Built-in cloud | | | | | |
| AWS (Amazon Web Services) | Yes | Yes | Yes | No | No |


To be fair it would be nice to see the same table compiled or commented by the authors of snakemake... With respect to slurm, I don't know what is meant by "partial" support in snakemake. I started playing with snakemake and running jobs using slurm is incredibly simple.

Reply written 4 months ago by dariober

Yeah, snakemake has full support for anything that uses drmaa, which I expect is also what Galaxy uses and probably what nextflow uses. Further, the footnote in the table for that section basically amounts to, "Actually, it has full support for these and any future schedulers, you just have to tell it how to execute the commands." I prefer the snakemake way of doing this, since everyone submits jobs through a wrapper I wrote and that way lots of things (temp space, memory usage, queue, etc.) can be conveniently set without including them again and again in snakemake files.
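
For readers wondering what such a submission wrapper might look like, here is a rough sketch in Python. It is not the wrapper described above. It assumes a SLURM cluster and `sbatch`, and the `DEFAULTS` values and the `sbatch_command` helper are hypothetical. The point is only that site-wide settings live in one place and individual jobs override what they need.

```python
import shlex

# Hypothetical site-wide defaults; real values depend on the cluster.
DEFAULTS = {"partition": "normal", "mem": "4G", "time": "01:00:00"}


def sbatch_command(jobscript, **overrides):
    """Build an sbatch command line, filling in site defaults for unset options."""
    opts = {**DEFAULTS, **overrides}  # per-job values win over the defaults
    cmd = ["sbatch"]
    for key in sorted(opts):
        cmd.append(f"--{key}={opts[key]}")
    cmd.append(jobscript)
    # Quote each part so the result is safe to paste into a shell.
    return " ".join(shlex.quote(part) for part in cmd)
```

For example, `sbatch_command("job.sh", mem="16G")` gives `sbatch --mem=16G --partition=normal --time=01:00:00 job.sh`, so a workflow file only has to mention the memory it actually needs.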

Reply written 4 months ago by Devon Ryan

Thanks for sharing this table, @shenwei356! It is very interesting to see that nextflow has stream processing, workflow versioning, and full support for SLURM, in addition to having native task support for any language. I believe snakemake has native task support for R now as well. Thanks again for your answer.

Reply written 4 months ago by ropolocan

ropolocan (Canada) wrote, 4 months ago:

I just read this excellent review by @Jeremy Leipzig. The article can be helpful for deciding which workflow management system best suits one's needs: https://academic.oup.com/bib/article-lookup/doi/10.1093/bib/bbw020
