I am trying to run clumpify (from the BBTools package) to deduplicate reads from multiple compressed paired-end FASTQ files. Is that possible with clumpify without first concatenating all the files? So far I've tried:
clumpify.sh in=L1_R1.fq.gz,L2_R1.fq.gz in2=L1_R2.fq.gz,L2_R2.fq.gz out=dd_R1.fq.gz out2=dd_R2.fq.gz ziplevel=2 dedupe=t
but this resulted in an error; it looks like the comma-separated "," syntax is not supported here. I also tried:
clumpify.sh in=<(zcat L1_R1.fq.gz L2_R1.fq.gz) in2=<(zcat L1_R2.fq.gz L2_R2.fq.gz) out=dd_R1.fq.gz out2=dd_R2.fq.gz ziplevel=2 dedupe=t
This one hangs indefinitely. It doesn't appear to be doing anything; it just waits.
* I was able to do what I want using dedupe.sh from the same package, but in a comparison on a single file it was much slower than clumpify.
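
For reference, the concatenation fallback I'm hoping to avoid would look roughly like the sketch below. It relies on the fact that gzip files joined with cat form a valid multi-member gzip stream, so nothing has to be decompressed first. The tiny demo reads and the all_R1/all_R2 file names are made up for illustration, and I'm assuming (not having confirmed) that clumpify reads multi-member gzip input correctly.

```shell
# Hypothetical stand-ins for the real lane files (demo reads, made-up content).
tmp=$(mktemp -d)
for mate in R1 R2; do
  printf '@read1\nACGT\n+\nIIII\n' | gzip -c > "$tmp/L1_${mate}.fq.gz"
  printf '@read2\nTTGA\n+\nIIII\n' | gzip -c > "$tmp/L2_${mate}.fq.gz"

  # Concatenated gzip members are still a valid gzip stream,
  # so the lanes can be merged without decompressing them.
  cat "$tmp/L1_${mate}.fq.gz" "$tmp/L2_${mate}.fq.gz" > "$tmp/all_${mate}.fq.gz"
done

# Sanity check: two 4-line records per merged file.
merged_lines=$(gzip -dc "$tmp/all_R1.fq.gz" | wc -l | tr -d ' ')
echo "merged R1 lines: $merged_lines"

# Run clumpify on the merged pair only if it is on the PATH.
# Same options as in my attempts above; the merged file names are assumptions.
if command -v clumpify.sh >/dev/null 2>&1; then
  clumpify.sh in="$tmp/all_R1.fq.gz" in2="$tmp/all_R2.fq.gz" \
    out="$tmp/dd_R1.fq.gz" out2="$tmp/dd_R2.fq.gz" ziplevel=2 dedupe=t
fi
```

This works in principle, but it doubles the disk usage for the input, which is why I'd prefer a way to pass several files to clumpify directly.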