I have 20 bzip2-compressed FASTQ files. Each compressed file is ~4 GB, and the reads are 50 bp long. I want to trim the reads in these compressed files to 36 bp.
I have tried bioawk but it does not accept bz2 files.
I also tried FASTX-Toolkit, which does not accept bz2 files either, so I piped the decompressed stream into it:
bzcat input.fastq.bz2 | fastx_trimmer -l 36 -i - | gzip > trimmed.fastq.gz
I used gzip here (or possibly pigz, to make it faster) because my next step is mapping with BWA, which does not accept bz2 files but does accept gz files.
The above command works, but it takes around 30 minutes per file. Without gzip it takes about 22 minutes per file, but then the output files are large. For 20 files this is going to take a long time, and in future I will be receiving 40-45 files like this.
Can anyone suggest a more efficient, less time-consuming alternative?
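For reference, here is a rough sketch of how I would wrap the same pipeline to run it over every file, using pigz when it is available. The helper name trim_one and the output naming (.trimmed.fastq.gz) are only illustrative, and it assumes bzcat, fastx_trimmer, and either pigz or gzip are on the PATH.

```shell
# Sketch only: trim_one is a hypothetical helper name, not part of any tool.
# It trims every read in one bzip2-compressed FASTQ file to 36 bp and writes
# a gzip-compressed result next to the input file.
trim_one() {
    in=$1
    # Illustrative naming: sample.fastq.bz2 -> sample.trimmed.fastq.gz
    out=${in%.fastq.bz2}.trimmed.fastq.gz

    # Prefer pigz (multi-threaded gzip) when it is installed, else use gzip.
    if command -v pigz >/dev/null 2>&1; then
        compressor=pigz
    else
        compressor=gzip
    fi

    # Same pipeline as the one-liner above, with the compressor swapped in.
    bzcat "$in" | fastx_trimmer -l 36 -i - | "$compressor" -c > "$out"
}
```

Calling trim_one once per file in a loop over *.fastq.bz2 would process the whole batch one file after another, so by itself it only speeds up the compression step, not the decompression or trimming.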