Are there any working GNU Parallel or similar shell code examples showing how to run Pilon, Prokka, vt and SnpEff on a batch of input files?
4.1 years ago

gnu shell parallel prokka snpeff • 1.5k views

Hi england_bioinformatics_team,

The GNU Parallel manual is quite extensive; have you had a look at it? See also this post: Gnu Parallel - Parallelize Serial Command Line Programs Without Changing Them

Cheers,
Wouter


While GNU Parallel is very good at parallelizing individual commands, what you might be looking for is a workflow manager such as Snakemake, which is designed to run multi-step workflows.

4.1 years ago
Joe 20k

Can you be more specific?

What kind of files/what file structure?

The format for parallel is pretty much the same for whatever you want to do:

parallel 'mycommand {} {.}.ext' ::: inputfile(s)


The command is simply whatever you would normally type to invoke the program. Here's a real example of running a Python script:

parallel --gnu 'radar.py -a {} > {.}.radar' ::: *.faa


The {} is replaced by the input file name, and {.} by the input file name with its last extension stripped off, so that you can append your own extension to identify your output files.
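
To make the substitution concrete, here is a rough sketch of the same naming rule written as a plain shell function, so you can see which command would be built for each input before involving parallel at all. The function name preview and the file names are made up for illustration, and `${f%.*}` only matches {.} for simple file names in the current directory (a dot in a directory name would confuse it).

```shell
# Print the command that would be built for each input, without running it.
# "preview" and the sample file names below are hypothetical.
preview() {
    for f in "$@"; do
        # {}  corresponds to "$f"       (the input as given)
        # {.} corresponds to "${f%.*}"  (input with its last extension removed)
        printf 'radar.py -a %s > %s.radar\n' "$f" "${f%.*}"
    done
}

preview sample1.faa sample2.tar.gz
```

Note that only the last extension is removed, so sample2.tar.gz becomes sample2.tar. If parallel is installed, its --dry-run option prints the generated commands instead of running them, which is a quicker way to check the real thing.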

Alternatively, you can pipe the file list in from ls, e.g.:

ls *.gz | parallel --gnu gunzip &


Just be careful to manage the number of threads you launch, as most of those tools also have their own threading option. If you start as many parallel jobs as you have cores and each job then tries to multithread as well, you will quickly oversubscribe the CPU and the run won't finish any faster. You can cap the number of simultaneous jobs with parallel's -j option.
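
As a rough sketch of that balancing act, the snippet below divides the available cores between parallel jobs and per-job threads, then launches Prokka only if everything it needs is present. The value of THREADS, the *.fasta glob and the output naming are my assumptions, not anything from your setup; check the Prokka options against the version you have installed.

```shell
# Split the available cores between parallel jobs and per-job threads.
# THREADS and the *.fasta glob are assumptions; adjust both for your data.
CORES=$(getconf _NPROCESSORS_ONLN 2>/dev/null || echo 1)
THREADS=2
JOBS=$(( CORES / THREADS ))
if [ "$JOBS" -lt 1 ]; then
    JOBS=1   # always run at least one job, even on a single core
fi

echo "running up to $JOBS jobs with $THREADS threads each on $CORES cores"

# Only launch when both tools are on PATH and there is input to process.
if command -v parallel >/dev/null 2>&1 \
   && command -v prokka >/dev/null 2>&1 \
   && ls *.fasta >/dev/null 2>&1; then
    # -j caps concurrent jobs; --cpus caps the threads Prokka uses per job.
    parallel -j "$JOBS" \
        "prokka --cpus $THREADS --outdir {.}_prokka --prefix {.} {}" ::: *.fasta
fi
```

The same pattern should carry over to the other tools: keep jobs times threads at or below the core count, and swap in whatever command line you would normally run for a single file.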