Question: Benchmarking bioinformatics workflow with common patterns and motifs?
4.4 years ago, Samuel Lampa (1.2k) wrote:

Is anyone aware of a good example workflow that incorporates some of the most common motifs or patterns in bioinformatics workflows, and that could be used to try out various workflow engines and see how easy it is to encode the workflow in each of them?

Some examples of what I mean by patterns:

  • File split / Process / File join
  • Map / Reduce
  • Parameter sweep scatter/gather
  • Nested parameter sweeps
  • Cross-validation fold-generation
  • Nested parameter sweeps and fold-generation
  • Multiple inputs and outputs of processes
  • Replicate fixed output file pattern (when a tool's output file path can't be changed)
  • Unknown number of outputs (such as line-count based file split with unknown file size, or reading row-by-row from a database)
  • Conditional execution based on a web service call
  • ... (fill in) ...

And if not, how much interest would there be in crowd-sourcing such an example?
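To make the first and ninth motifs a bit more concrete (file split / process / file join, where the number of chunks is not known until run time), here is a minimal Python sketch. It is only an illustration under stated assumptions: the function names, the chunk file naming, and the upper-casing "process" step are all hypothetical placeholders, not part of any particular workflow engine.

```python
import tempfile
from pathlib import Path


def write_chunk(out_dir: Path, index: int, lines: list[str]) -> Path:
    """Write one chunk file; the naming scheme is a hypothetical choice."""
    path = out_dir / f"chunk_{index:04d}.txt"
    path.write_text("".join(lines))
    return path


def split_by_line_count(src: Path, out_dir: Path, lines_per_chunk: int) -> list[Path]:
    """Split src into chunks; how many chunks result is only known at run time."""
    chunks: list[Path] = []
    buffer: list[str] = []
    with src.open() as handle:
        for line in handle:
            buffer.append(line)
            if len(buffer) == lines_per_chunk:
                chunks.append(write_chunk(out_dir, len(chunks), buffer))
                buffer = []
    if buffer:  # trailing partial chunk
        chunks.append(write_chunk(out_dir, len(chunks), buffer))
    return chunks


def process(chunk: Path) -> Path:
    """Placeholder per-chunk step (upper-casing stands in for a real tool)."""
    out = chunk.with_name(chunk.stem + ".processed.txt")
    out.write_text(chunk.read_text().upper())
    return out


def join(parts: list[Path], dest: Path) -> Path:
    """Concatenate the processed chunks in their original order."""
    dest.write_text("".join(part.read_text() for part in parts))
    return dest


def run_split_process_join(text: str, lines_per_chunk: int) -> tuple[int, str]:
    """Run the whole motif in a temporary directory; return (chunk count, joined text)."""
    with tempfile.TemporaryDirectory() as tmp:
        tmp_dir = Path(tmp)
        src = tmp_dir / "input.txt"
        src.write_text(text)
        chunks = split_by_line_count(src, tmp_dir, lines_per_chunk)
        processed = [process(chunk) for chunk in chunks]
        joined = join(processed, tmp_dir / "joined.txt")
        return len(chunks), joined.read_text()
```

The point a workflow engine has to handle is that the list returned by split_by_line_count cannot be declared up front, so the downstream process and join steps must be scheduled dynamically.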

pipelines workflows • 1.1k views
modified 4.4 years ago • written 4.4 years ago by Samuel Lampa (1.2k)

Hypothetical workflows are not as powerful as actual workflows -- I would perhaps take actual workflows like the GATK best practices, or a WGBS workflow, and implement them in different workflow designers. You are far more likely to get crowd-sourced help that way too, since the results will be directly applicable to many people.

written 4.4 years ago by John (12k)

Indeed. I'm looking for something as realistic as possible, and simple concrete examples would be optimal.

I just want to make sure that the most relevant patterns, or "type examples", are covered. E.g. the above examples are motifs that we have come across in our own work applying machine learning in drug discovery, and they are not always found in more sequencing-centric examples.

But I'd like to have an example, or a small set of examples, covering as wide a spectrum of type examples as possible, from genomics, machine learning, and any other common sub-area.
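As one possible "type example" from the machine learning side, the nested parameter sweep combined with cross-validation fold generation could be sketched roughly as below. This is a hedged illustration only: the parameter names cost and gamma, the contiguous fold scheme, and the evaluate function (which returns a made-up score instead of training a model) are all assumptions for the sake of the example.

```python
from itertools import product
from statistics import mean


def make_folds(n_items, k):
    """Yield (train_indices, test_indices) for k contiguous folds."""
    indices = list(range(n_items))
    # Spread the remainder over the first folds so sizes differ by at most one.
    sizes = [n_items // k + (1 if i < n_items % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, test
        start += size


def evaluate(cost, gamma, train, test):
    """Hypothetical stand-in for 'train on train, score on test'.

    Returns a made-up score that peaks at cost=1.0, gamma=0.1.
    """
    return -(abs(cost - 1.0) + abs(gamma - 0.1))


def nested_sweep(n_items, k, costs, gammas):
    """Scatter over (cost, gamma, fold), then gather mean scores per (cost, gamma)."""
    # Scatter: one independent task per parameter combination and fold.
    tasks = [
        (cost, gamma, train, test)
        for cost, gamma in product(costs, gammas)
        for train, test in make_folds(n_items, k)
    ]
    # Gather: collect the fold scores for each parameter combination.
    scores = {}
    for cost, gamma, train, test in tasks:
        scores.setdefault((cost, gamma), []).append(evaluate(cost, gamma, train, test))
    mean_scores = {params: mean(values) for params, values in scores.items()}
    best = max(mean_scores, key=mean_scores.get)
    return best, len(tasks)
```

In a workflow engine, each entry in tasks would typically become its own process invocation, and the gather step would be a downstream process that waits for all folds of a given parameter combination.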

modified 4.4 years ago • written 4.4 years ago by Samuel Lampa (1.2k)