We are happy to announce the release of elPrep 4.0.0, an open-source, drop-in replacement for GATK4/Picard/SAMtools for preparing SAM/BAM files for variant calling. elPrep produces identical results while greatly improving computational performance. For more details, see the elPrep GitHub repository.
elPrep 4.0.0 introduces multiple new features that allow it to execute the preparation steps defined by the GATK Best Practices for variant calling.
New features include:
- added base quality score recalibration (BQSR)
- added optical duplicate marking
- added metrics (MultiQC compatible)
- support for SAM File Format version 1.6
- support for FASTA and VCF files
- support for elPrep-specific elsites and elfasta formats
- split/filter/merge (sfm) mode now implemented in Go instead of Python
- added --log-path option to all tools
- various API and performance improvements
- changed license to the GNU Affero General Public License version 3 as published by the Free Software Foundation, with Additional Terms
- updated demos
Our benchmarks show that elPrep 4.0.0 executes the sort, deduplicate, recalibrate, and apply-BQSR pipeline from the GATK Best Practices up to 12x faster for WES data and up to 7.5x faster for WGS data, while utilising similar or fewer compute resources than Picard/GATK4.
Example runtime, RAM use, and disk use for 50x WGS Illumina Platinum Genome NA12878 aligned against hg38. elPrep runs the 4 pipeline steps together for efficient parallel execution.
We are looking forward to your feedback and suggestions.
Thanks a lot!
Charlotte Herzeel, Exascience Life Lab, Imec, Belgium