We have WGS data for 8 samples from the same parasite species but with two phenotypes (suggested to be two sub-strains). Since no known variants are available for this species, I have planned the following approach to get a high-quality variant set.
1) Call variants for each sample using 3 callers (GATK, Samtools, FreeBayes)
2) Combine the 3 VCF files from the 3 callers for each sample into a single VCF per sample, and filter based on population statistics and population consensus per site
3) Combine the per-sample VCF files into one VCF file
4) Filter to get a final high-quality VCF for downstream analysis
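To make the consensus idea in step 2 concrete, here is a minimal Python sketch of what I mean by keeping sites supported by more than one caller. It is only an illustration: the function names `parse_sites` and `consensus_sites`, the `min_callers=2` threshold, and the toy records are my own assumptions, and it assumes the three VCFs have already been normalized (for example with `bcftools norm`) so that the same variant has the same CHROM/POS/REF/ALT in every file.

```python
from collections import Counter


def parse_sites(vcf_text):
    """Return the set of (chrom, pos, ref, alt) keys found in VCF text."""
    sites = set()
    for line in vcf_text.splitlines():
        # Skip blank lines and header lines.
        if not line or line.startswith("#"):
            continue
        chrom, pos, _id, ref, alt = line.split("\t")[:5]
        # Split multi-allelic records so each ALT allele is compared separately.
        for allele in alt.split(","):
            sites.add((chrom, int(pos), ref, allele))
    return sites


def consensus_sites(callsets, min_callers=2):
    """Keep sites reported by at least `min_callers` of the callers."""
    counts = Counter()
    for sites in callsets.values():
        counts.update(sites)
    return {site for site, n in counts.items() if n >= min_callers}


# Toy records for one sample (hypothetical data, first five VCF columns only).
gatk = "#CHROM\tPOS\tID\tREF\tALT\nchr1\t100\t.\tA\tG\nchr1\t200\t.\tC\tT\n"
samtools = "chr1\t100\t.\tA\tG\nchr1\t300\t.\tG\tA\n"
freebayes = "chr1\t100\t.\tA\tG\nchr1\t200\t.\tC\tT\nchr1\t300\t.\tG\tC\n"

callsets = {
    "gatk": parse_sites(gatk),
    "samtools": parse_sites(samtools),
    "freebayes": parse_sites(freebayes),
}
kept = consensus_sites(callsets, min_callers=2)
print(sorted(kept))
```

In this toy example the two chr1:300 calls disagree on the ALT allele, so neither reaches the 2-of-3 threshold. This only compares site keys; genotypes and quality annotations would still have to be taken from one of the callers.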
Is this a valid approach to get a final high-quality VCF?
Or should I instead generate a multi-sample VCF from each caller (e.g. GATK HaplotypeCaller in GVCF mode) and combine those to get the final VCF?
I would appreciate your comments on this.
Thank you. Regards Rangi