Question: Running assemblies in parallel
2.9 years ago, marit.hetland wrote:


I have 600+ bacterial isolates (*fastq.gz files) that I want to assemble.

I have a script that uses a for loop to trim adapters and run SPAdes on each isolate, but this takes a very long time. My computer should be powerful enough to process multiple isolates at once, which is what I want to do.

I have looked at GNU parallel, which seems to work well and to be faster than the for loop. As an example, I do the trimming like this:

parallel 'trimmomatic PE {}R1*.f*q.gz {}R2*.f*q.gz {}pair_R1.fq.gz {}unpair_R1.fq.gz {}pair_R2.fq.gz {}unpair_R2.fq.gz ILLUMINACLIP:NexteraPE-PE.fa:2:30:10 LEADING:3 TRAILING:3 SLIDINGWINDOW:4:15 MINLEN:36' ::: $(ls *.fastq.gz | rev | cut -c 16- | rev | uniq)

My question is: when running parallel in a folder containing my 600+ file pairs, will parallel try to run the program on all 600+ isolates simultaneously, or will it limit the number of files processed at one time based on how much the computer can manage? Or is there a way to limit how many files parallel works on at one time, apart from listing the specific files?

Thank you!

parallel fastq assembly • 1.2k views
2.9 years ago, Joe (United Kingdom) wrote:

Parallel will run as many jobs as it can concurrently* (i.e. one per core you have, unless you specify otherwise with -j). There is also a "max load" option (--load), described in the manual, that you can use to manage the load on the system.

*Edit: the caveat is that you have to be invoking enough commands. Obviously, if you have 32 cores but only 16 files to work on, only 16 jobs will be spawned.
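
As a minimal sketch of capping the job count with -j (assuming GNU parallel is installed; the isolate names and the echo command are placeholders standing in for real sample prefixes and the real trimming or assembly command):

```shell
# Placeholder isolate names; in practice these would be your sample prefixes.
isolates="isolate01 isolate02 isolate03 isolate04 isolate05 isolate06"

# Only run the demo if GNU parallel (not the moreutils tool of the same name) is available.
if parallel --version 2>/dev/null | grep -q 'GNU parallel'; then
    # -j 2: at most 2 jobs run at the same time, however many inputs are queued.
    # -k:   print the output in the same order as the inputs.
    result=$(parallel -k -j 2 echo "processing {}" ::: $isolates)
    echo "$result"
    # Alternative (not run here): adding --load 100% makes parallel hold back
    # new jobs while the system load is above one process per core.
fi
```

With 600+ inputs the behaviour should be the same: the remaining inputs wait in the queue until one of the running jobs finishes.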

2.9 years ago, WouterDeCoster wrote:

If you have a look at the parallel documentation, you will find the -j argument, which you can use to limit the number of jobs. There are probably more options to do similar things.
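
One way to pick a value for -j, sketched under an assumption that is not stated in the question: that each job is itself multithreaded (for SPAdes this would be set with its -t option) and that the figure of 4 threads per job is purely illustrative. The variable names are hypothetical.

```shell
# Assumption: each job is told to use 4 threads (for SPAdes, via its -t option).
threads_per_job=4

# Number of CPU cores currently online.
cores=$(getconf _NPROCESSORS_ONLN)

# Run as many jobs as fit without oversubscribing the cores, but at least one.
jobs=$(( cores / threads_per_job ))
if [ "$jobs" -lt 1 ]; then
    jobs=1
fi

echo "cores=$cores threads_per_job=$threads_per_job -> parallel -j $jobs"
```

The resulting value could then be passed to parallel as -j "$jobs". Memory may turn out to be the tighter limit for assemblies, in which case a smaller number is the safer choice.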
