We have a few hundred large compressed BAM files (between 70 GB and 300 GB each). They are not sorted, and we would like to convert them back to FASTQ so that we can align them with a different algorithm.
We have paired-end reads and were planning to first sort each BAM by read name using sambamba (http://lomereiter.github.io/sambamba/docs/sambamba-sort.html) and then convert the sorted BAM to FASTQ using bedtools (http://bedtools.readthedocs.io/en/latest/content/tools/bamtofastq.html). However, while the sorting is relatively fast (about 4 h per file), the conversion is very slow.
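For reference, the two steps described above can be sketched roughly as follows. This is only an illustration of the plan, not our exact script: the file names, output directory, and thread count are placeholders, and the helper names `build_commands` and `run_pipeline` are made up for this example. It assumes `sambamba` and `bedtools` are on the `PATH`.

```python
import shlex
import subprocess
from pathlib import Path


def build_commands(bam, out_dir, threads=8):
    """Return the sort and convert command lines (as argument lists) for one BAM."""
    bam = Path(bam)
    out_dir = Path(out_dir)
    # Placeholder naming scheme for the intermediate and output files.
    sorted_bam = out_dir / f"{bam.stem}.namesorted.bam"
    fq1 = out_dir / f"{bam.stem}_R1.fastq"
    fq2 = out_dir / f"{bam.stem}_R2.fastq"

    # Step 1: sort by read name (-n) so that mates end up next to each other.
    sort_cmd = [
        "sambamba", "sort", "-n",
        "-t", str(threads),
        "-o", str(sorted_bam),
        str(bam),
    ]
    # Step 2: write the two mates of each pair to separate FASTQ files.
    convert_cmd = [
        "bedtools", "bamtofastq",
        "-i", str(sorted_bam),
        "-fq", str(fq1),
        "-fq2", str(fq2),
    ]
    return sort_cmd, convert_cmd


def run_pipeline(bam, out_dir, threads=8, dry_run=True):
    """Print each command; run it as well unless dry_run is set."""
    for cmd in build_commands(bam, out_dir, threads):
        print(shlex.join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)


# Dry run: only prints the two command lines for a placeholder input.
run_pipeline("sample.bam", "fastq_out")
```

Calling `run_pipeline(..., dry_run=False)` would actually launch the tools; with a few hundred files this would be wrapped in a loop or submitted as one job per file.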
Is anyone aware of another procedure that would make the conversion faster? I've seen that there are alternatives to bedtools, such as picard (https://broadinstitute.github.io/picard/command-line-overview.html#FastqToSam) and biobambam2 (https://github.com/gt1/biobambam2). Does anyone know how these tools perform, whether any benchmarking has already been done, and/or whether there are better tools?
Thank you very much in advance :)