Question: Running assemblies in parallel
marit.hetland wrote (21 months ago):

Hello,

I have 600+ bacterial isolates (*fastq.gz files) that I want to assemble.

I have a script that uses a for loop to trim adapters and run SPAdes on each isolate, but this takes a very long time. My computer should be powerful enough to process multiple isolates at once, which is what I want to do.

I have looked at GNU parallel, which seems to work well and is faster than the for loop. As an example, I do the trimming like this:

parallel 'trimmomatic PE {}R1*.f*q.gz {}R2*.f*q.gz {}pair_R1.fq.gz {}unpair_R1.fq.gz {}pair_R2.fq.gz {}unpair_R2.fq.gz ILLUMINACLIP:NexteraPE-PE.fa:2:30:10 LEADING:3 TRAILING:3 SLIDINGWINDOW:4:15 MINLEN:36' ::: $(ls *.fastq.gz | rev | cut -c 16- | rev | uniq)

My question is: when running parallel in a folder containing my 600+ file pairs, will parallel try to run the program on all 600+ isolates simultaneously, or will it limit the number of files processed at one time based on what the computer can manage? Or is there a way to limit how many files parallel works on at one time, apart from specifying the specific files?

Thank you!

Joe wrote (21 months ago):

Parallel will run as many jobs as it can concurrently* (i.e. one per CPU core by default, unless you specify otherwise with -j). There is also a "max load" option (--load), described in the manual, which you can use to manage the load on the system.

*Edit: the caveat being that you need to invoke enough commands. Obviously, if you have 32 cores but only 16 files to work on, only 16 jobs will be started.
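
To make the two options concrete, here is a minimal sketch. The sample prefixes (isolateA_ and so on) and the limits of 8 jobs and 80% load are made up for illustration, and echo stands in for the real trimmomatic command.

```shell
# Hypothetical sample prefixes; in the question these come from the
# ls | rev | cut pipeline. "echo" stands in for the real command.

# Cap the run at 8 concurrent jobs, however many inputs there are.
limit_by_jobs() {
    parallel -j 8 'echo trimming {}' ::: isolateA_ isolateB_ isolateC_
}

# Only start a new job while the system load is below 80% of the CPU count
# (--load takes the same syntax as -j).
limit_by_load() {
    parallel --load 80% 'echo trimming {}' ::: isolateA_ isolateB_ isolateC_
}
```

Calling limit_by_jobs or limit_by_load runs the three placeholder jobs. Adding --dry-run to either command prints the commands that would be run without executing them, which is a cheap check before starting on 600+ isolates.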

WouterDeCoster wrote (21 months ago):

If you have a look at the parallel documentation, you will find the -j argument, which you can use to limit the number of jobs. There are probably more options that do similar things.
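
One thing to keep in mind when picking a value for -j: SPAdes is itself multi-threaded (its -t option sets the thread count), so the total number of threads in use is roughly jobs times threads per job. The sketch below derives a job count from the core count. The figure of 4 threads per job and the {}spades output directory name are assumptions for illustration; the input names follow the paired output files from the trimming command in the question.

```shell
# Number of online CPU cores, as reported by the system.
total_cores=$(getconf _NPROCESSORS_ONLN)

# Assumption for illustration: give each SPAdes run 4 threads.
threads_per_job=4

# Run as many jobs as fit into the available cores, but at least one.
jobs=$(( total_cores / threads_per_job ))
if [ "$jobs" -lt 1 ]; then
    jobs=1
fi

# Usage: assemble_all prefix1_ prefix2_ ...
# The output directory name ({}spades) is hypothetical.
assemble_all() {
    parallel -j "$jobs" \
        "spades.py -1 {}pair_R1.fq.gz -2 {}pair_R2.fq.gz -o {}spades -t $threads_per_job" \
        ::: "$@"
}
```

Memory can also be a limiting factor for assemblies, so a lower job count than the core count suggests may be needed.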
