Variant caller comparison for non-human data (highly varied populations)
6.5 years ago
hermathena ▴ 40

Hi,

Is anyone aware of a recent comparison of variant callers (GATK, FreeBayes, etc.) for non-model organisms, please? There are many such comparisons for human data, presumably because good reference sets exist. My data are hundreds of whole genomes from an insect species (>10% of sites variable!), and we traditionally use GATK. However, the GATK HaplotypeCaller is rather slow on this data. Sensitivity is a higher concern than precision, since we are not looking for specific SNP associations.

variant calling genotyping GATK FreeBayes Platypus • 1.5k views

Just FYI, the Broad is about to release GATK4 and says it brings notable improvements in speed. Maybe it is worth trying the beta release of GATK4 to see whether it performs well for your task?


Thanks for this. I have experimented with the GATK v4 beta. There are some gains in speed through multithreading. Unfortunately, many bugs also crop up, and the Broad is recommending against using GATK4 with Spark for now. That horrible Queue parallelisation is gone, but now you need to use something called GenomicsDBImport to merge gVCFs, and that needs to operate separately on each scaffold... For the time being one may as well use GATK3. There is still the benefit of the InDel model.
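
To give an idea of what "separately on each scaffold" means in practice, here is a rough Python sketch that builds one GenomicsDBImport command per scaffold listed in a samtools `.fai` index. It only assembles the command lines and does not run anything. The function names, file names and workspace layout are made up for illustration, and the flags (`--genomicsdb-workspace-path`, `-V`, `-L`) are assumed from the GATK 4 release documentation; the beta may spell them differently, so check `gatk GenomicsDBImport --help` for your version.

```python
import shlex


def read_scaffolds(fai_path):
    """Return scaffold names from a samtools .fai index (first tab-separated column)."""
    names = []
    with open(fai_path) as handle:
        for line in handle:
            line = line.rstrip("\n")
            if line:
                names.append(line.split("\t")[0])
    return names


def build_import_commands(scaffolds, gvcfs, workspace_root):
    """Build one GenomicsDBImport argument list per scaffold.

    Flag names are an assumption based on the GATK 4 release docs;
    the workspace layout (one directory per scaffold) is hypothetical.
    """
    commands = []
    for scaffold in scaffolds:
        cmd = [
            "gatk", "GenomicsDBImport",
            "--genomicsdb-workspace-path", f"{workspace_root}/{scaffold}",
            "-L", scaffold,
        ]
        # Every sample's gVCF is passed to every per-scaffold import.
        for gvcf in gvcfs:
            cmd += ["-V", gvcf]
        commands.append(cmd)
    return commands


if __name__ == "__main__":
    # Hypothetical inputs, for illustration only.
    demo = build_import_commands(
        ["scaffold_1", "scaffold_2"],
        ["sampleA.g.vcf.gz", "sampleB.g.vcf.gz"],
        "genomicsdb",
    )
    for cmd in demo:
        print(" ".join(shlex.quote(part) for part in cmd))
```

With thousands of scaffolds this produces thousands of small jobs, which is what makes the per-scaffold requirement awkward for fragmented assemblies. Scaffold names containing characters such as `|` or `/` would also need sanitising before being used as directory names.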
