I am currently setting up a variant calling pipeline for WGS using GATK4. As part of this, base quality score recalibration (BQSR) is essential. Since GATK4 there is the option to parallelize this step with
Spark. Using this option leads to the warning
Warning: ApplyBQSRSpark is a BETA tool and is not yet ready for use in production
On the GATK homepage it is also stated that
ApplyBQSRSpark is a BETA version.
During my tests, however, I did not notice any difference in the results compared to the "stable" single-core
ApplyBQSR. That said, I haven't tried many BAM files yet. Because the speedup with
ApplyBQSRSpark is enormous, I would really like to use it.
I was wondering whether any of you has experience with
ApplyBQSRSpark and whether it outputs the same variants in the end.
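
For reference, this is roughly how I would check whether the two tools give the same variants in the end. It is only a minimal sketch, not anything from GATK itself: it assumes the rest of the pipeline was run twice (once on the ApplyBQSR output, once on the ApplyBQSRSpark output) and that the two resulting VCF files are plain text or gzip/bgzip compressed. The function names and the file paths passed on the command line are hypothetical, and a variant is identified only by chromosome, position, REF and ALT, so differences in genotypes or annotations would not be detected.

```python
import gzip
import sys
from typing import Dict, Set, Tuple

# A variant is identified by (chromosome, position, REF allele, ALT allele).
Variant = Tuple[str, int, str, str]


def load_variants(path: str) -> Set[Variant]:
    """Read a VCF file and return the set of variant keys it contains."""
    # gzip.open also reads bgzip-compressed files, since they are valid gzip.
    opener = gzip.open if path.endswith(".gz") else open
    variants: Set[Variant] = set()
    with opener(path, "rt") as handle:
        for line in handle:
            # Skip header lines and blank lines.
            if line.startswith("#") or not line.strip():
                continue
            fields = line.rstrip("\n").split("\t")
            chrom, pos, ref, alts = fields[0], int(fields[1]), fields[3], fields[4]
            # Split multi-allelic records so each ALT allele is compared on its own.
            for alt in alts.split(","):
                variants.add((chrom, pos, ref, alt))
    return variants


def compare_call_sets(path_a: str, path_b: str) -> Dict[str, object]:
    """Summarise which variants are shared and which are unique to each file."""
    a = load_variants(path_a)
    b = load_variants(path_b)
    return {
        "shared": len(a & b),
        "only_a": sorted(a - b),
        "only_b": sorted(b - a),
    }


if __name__ == "__main__" and len(sys.argv) == 3:
    # Hypothetical usage: python compare_vcfs.py single_core.vcf.gz spark.vcf.gz
    summary = compare_call_sets(sys.argv[1], sys.argv[2])
    print("shared variants:", summary["shared"])
    print("only in first file:", len(summary["only_a"]))
    print("only in second file:", len(summary["only_b"]))
```

If both "only in" counts are zero across a reasonable number of samples, that would at least suggest the call sets agree at the site level, although it would not prove that the recalibrated base qualities themselves are identical.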