I am creating sorted BAM files from multiple paired-end FASTQ files using an array together with the parallel command. I am not specifying the number of jobs for parallel. I thought parallel would make the job faster, but it takes longer to finish with parallel than with the array alone. Any help understanding why would be appreciated.
I think this has been covered here already. I'm no parallel expert myself, but as I understand it, you should not combine parallel with a for loop. Either use the for loop and process the files serially, or stream your input files to parallel and let it run them all in parallel.
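To illustrate the "stream your input files to parallel" option, here is a minimal sketch. It assumes GNU parallel and a hypothetical naming scheme of `<sample>_R1.fastq.gz` / `<sample>_R2.fastq.gz`. The `bwa mem | samtools sort` pipeline and `ref.fa` are placeholders for whatever alignment command you actually run, and `list_samples` and `run_all` are made-up helper names.

```shell
# Hypothetical layout: paired files named <sample>_R1.fastq.gz and <sample>_R2.fastq.gz.

# Print one sample prefix (directory + sample name) per line for a directory.
list_samples() {
    for f in "$1"/*_R1.fastq.gz; do
        [ -e "$f" ] || continue           # skip the literal pattern when nothing matches
        printf '%s\n' "${f%_R1.fastq.gz}"
    done
}

# Feed all prefixes to a single parallel call; there is no for loop around parallel.
# Assumption: bwa, samtools and ref.fa stand in for your real pipeline.
run_all() {
    list_samples "$1" |
        parallel 'bwa mem ref.fa {}_R1.fastq.gz {}_R2.fastq.gz | samtools sort -o {}.sorted.bam -'
}
```

Calling `run_all /path/to/fastq_dir` would then hand every sample to one parallel invocation, which decides how many to run at once.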
At some point you simply run out of RAM or I/O capacity on your system. Parallel will start multiple jobs at once, but it cannot overcome the limits of the hardware you have available.
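One way to stay within those limits is to cap the number of simultaneous jobs. When no job count is given, GNU parallel runs one job per CPU core, which can oversubscribe the machine if each job is itself multi-threaded or memory-hungry. The sketch below is an assumption-laden example rather than a tuned recipe: `jobs_for` and `run_capped` are hypothetical helpers, 4 threads per job is an arbitrary choice, and the `bwa mem | samtools sort` pipeline with `ref.fa` is a placeholder.

```shell
# Number of simultaneous jobs such that jobs * threads_per_job stays within the core count.
jobs_for() {
    n=$(( $1 / $2 ))          # $1 = available cores, $2 = threads per job
    [ "$n" -ge 1 ] || n=1     # always allow at least one job
    echo "$n"
}

# Read sample prefixes from stdin and run at most the computed number of jobs.
# Assumption: each job uses roughly $threads cores; adjust to your real pipeline.
run_capped() {
    threads=4
    njobs=$(jobs_for "$(nproc)" "$threads")
    parallel -j "$njobs" \
        "bwa mem -t $threads ref.fa {}_R1.fastq.gz {}_R2.fastq.gz | samtools sort -o {}.sorted.bam -"
}
```

On a 16-core machine this would run 4 jobs at a time. If RAM or disk throughput is the bottleneck rather than CPU, an even lower `-j` value may be faster.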