It takes about 15 hours to trim my paired-end human genome sample. There are ways to run trim_galore in parallel across multiple samples, but could you suggest a method for speeding up a single sample?
TrimGalore! is a wrapper around cutadapt. You could therefore use cutadapt directly, which has a multithreading option (-j/--cores). If pigz, a multithreaded version of gzip, is in your PATH, cutadapt will use it for (de)compression rather than the default gzip. Together, these should speed things up.
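A minimal sketch of calling cutadapt directly on one read pair. The function name trim_pair and the trimmed_ output prefix are made up for illustration. The adapter sequence and thresholds are an assumption meant to approximate Trim Galore's Illumina defaults, so the output may not match Trim Galore's exactly.

```shell
# Hypothetical helper: trim one read pair with multi-threaded cutadapt.
# $1 = R1 FASTQ, $2 = R2 FASTQ, $3 = number of cores (defaults to 8).
trim_pair() {
    # -j: worker threads; -q: 3' quality cutoff; -O: minimum adapter overlap;
    # -e: maximum error rate; -m: discard pairs with a read shorter than this.
    # -a / -A: 3' adapters for R1 / R2 (assumed Illumina universal adapter).
    cutadapt \
        -j "${3:-8}" \
        -q 20 -O 1 -e 0.1 -m 20 \
        -a AGATCGGAAGAGC -A AGATCGGAAGAGC \
        -o "trimmed_$(basename "$1")" \
        -p "trimmed_$(basename "$2")" \
        "$1" "$2"
}
```

Usage would look like: trim_pair reads_1.fq.gz reads_2.fq.gz 8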
I moved my answer to a comment, as this is not possible within a given Nextflow pipeline without modifications.
You could split your input file into multiple pieces and trim those in parallel.
Or you can use a multi-threaded trimming program like bbduk.sh to speed the process up. A guide for bbduk is available.
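As a rough sketch, adapter trimming of a read pair with bbduk.sh could look like the following. The parameters follow the adapter-trimming example in the BBDuk guide. The function name bbduk_trim, the bbduk_ output prefix and the adapter file path are assumptions (an adapters.fa file ships in the BBTools resources directory, but its location depends on your installation).

```shell
# Hypothetical helper: adapter-trim one read pair with bbduk.sh.
# $1 = R1 FASTQ, $2 = R2 FASTQ, $3 = adapter FASTA, $4 = threads (defaults to 8).
bbduk_trim() {
    # ktrim=r: trim adapters from the right (3') end; k/mink: k-mer sizes used
    # for matching; hdist=1: allow one mismatch; tpe: trim both mates to the
    # same length; tbo: also trim adapters detected by pair overlap.
    bbduk.sh in1="$1" in2="$2" \
        out1="bbduk_$(basename "$1")" out2="bbduk_$(basename "$2")" \
        ref="$3" ktrim=r k=23 mink=11 hdist=1 tpe tbo \
        threads="${4:-8}"
}
```

Usage would look like: bbduk_trim reads_1.fq.gz reads_2.fq.gz adapters.fa 8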
You could just split your fastq files into multiple chunks (assuming you know the total number of reads) and then run multiple trim_galore commands.
You could try this if you have seqkit:
seqkit split2 -1 reads_1.fq.gz -2 reads_2.fq.gz -p 2 -O out
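Once the parts exist, each pair can be trimmed as a background job. This is only a sketch: it assumes seqkit named the parts out/reads_1.part_001.fq.gz, out/reads_2.part_001.fq.gz and so on, and the function name trim_parts and the trimmed output directory are made up for illustration.

```shell
# Hypothetical helper: run trim_galore on every part pair in parallel.
trim_parts() {
    mkdir -p trimmed
    for r1 in out/reads_1.part_*.fq.gz; do
        # Derive the mate's file name from the R1 part name (assumed naming).
        r2=$(printf '%s' "$r1" | sed 's/reads_1\.part_/reads_2.part_/')
        trim_galore --paired -o trimmed "$r1" "$r2" &   # one background job per part
    done
    wait   # block until all background trimming jobs have finished
}
```

This starts one job per part at the same time, so keep the number of parts at or below the number of cores you can spare.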
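Or with standard Unix tools (the line count must be a multiple of 4, paired-end files must both be split with the same line count, and you have to re-gzip the chunks at the end):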
zcat XXX.recal.fastq.gz | split -l 4000000 - prefix
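For paired-end data, the same idea can be sketched as below, splitting both mates with an identical line count so that chunk aa of R1 pairs with chunk aa of R2. The function name split_pair, the chunks directory and the R1_/R2_ prefixes are made up for illustration.

```shell
# Hypothetical helper: split a read pair into synchronized, gzipped chunks.
# $1 = R1 FASTQ (gzipped), $2 = R2 FASTQ (gzipped),
# $3 = lines per chunk (defaults to 4000000, i.e. 1,000,000 reads).
split_pair() {
    lines=${3:-4000000}
    # A FASTQ record is 4 lines, so the chunk size must be a multiple of 4.
    if [ $((lines % 4)) -ne 0 ]; then
        echo "lines per chunk must be a multiple of 4" >&2
        return 1
    fi
    mkdir -p chunks
    gzip -dc "$1" | split -l "$lines" - chunks/R1_
    gzip -dc "$2" | split -l "$lines" - chunks/R2_
    gzip chunks/R1_* chunks/R2_*   # re-compress every chunk
}
```

Usage would look like: split_pair reads_1.fq.gz reads_2.fq.gz 4000000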
I'm using nf-core/sarek, which has an option for splitting, so I will try it out.