QVINTVS_FABIVS_MAXIMVS, 7.4 years ago
Say I have SNP and INDEL calls for 1000 individuals. The samples were joint-called and recalibrated with GATK in 10 batches.
As a result I have 10 VCF files with SNP and INDEL calls that I would like to merge. I only have access to the VCF files, so re-calling from the BAMs is not an option.
I'm familiar with bcftools but I'm unclear on the best way forward. 
- Should I split multiallelic entries into biallelic before merging? 
- If I'm interested in rare variants, should I omit multiallelic variants? 
- Should I left-align before merging, after merging, or both?
Thank you for any advice
I think your best bet would be GATK's CombineVariants tool, as I have had issues merging VCFs with vcftools/bcftools in the past. Possibly this (GATK 3.8 documentation): https://software.broadinstitute.org/gatk/documentation/tooldocs/3.8-0/org_broadinstitute_gatk_tools_walkers_variantutils_CombineVariants.php
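To make that suggestion concrete, here is a rough sketch of what a GATK 3.x CombineVariants call might look like, assembled in Python so the ten batch files don't have to be typed by hand. The file names (`batch1.vcf.gz` ... `batch10.vcf.gz`, `reference.fasta`, `merged.vcf.gz`) and the jar path are placeholders, not anything from the original post, and the linked documentation should be checked for the genotype-merge options that suit your data.

```python
import shlex


def combine_variants_cmd(vcfs, reference, output, gatk_jar="GenomeAnalysisTK.jar"):
    """Build a GATK 3.x CombineVariants command line as an argument list.

    All paths are hypothetical placeholders; adjust them to your setup.
    """
    cmd = ["java", "-jar", gatk_jar, "-T", "CombineVariants", "-R", reference]
    for vcf in vcfs:
        # One --variant argument per input VCF.
        cmd += ["--variant", vcf]
    cmd += ["-o", output]
    return cmd


# Ten per-batch VCFs (assumed naming scheme).
batches = [f"batch{i}.vcf.gz" for i in range(1, 11)]
cmd = combine_variants_cmd(batches, "reference.fasta", "merged.vcf.gz")

# Print the command for inspection; run it with subprocess.run(cmd, check=True)
# once the paths are real.
print(shlex.join(cmd))
```

The command is built as a list rather than a single string so it can be passed to `subprocess.run` without shell quoting problems.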
Yes, I believe GATK's CombineVariants is the answer.
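For comparison, the bcftools route mentioned in the question could be sketched roughly as below. This is one possible ordering, not a tested recipe for this dataset: it assumes that each batch is normalized first (multiallelic records split and indels left-aligned with `bcftools norm`) so that the same variant has the same POS/REF/ALT representation in every file, and that the batches are then combined with `bcftools merge`. File names and the reference path are placeholders.

```python
import shlex


def norm_cmd(vcf, reference, out):
    # -f: reference FASTA, required for left-alignment of indels
    # -m -any: split multiallelic records (SNPs and indels) into biallelic ones
    return ["bcftools", "norm", "-f", reference, "-m", "-any", "-Oz", "-o", out, vcf]


def index_cmd(vcf):
    # -t: write a tabix (.tbi) index; bcftools merge needs indexed inputs
    return ["bcftools", "index", "-t", vcf]


def merge_cmd(vcfs, out):
    # -m none: do not create new multiallelic records while merging
    return ["bcftools", "merge", "-m", "none", "-Oz", "-o", out, *vcfs]


def plan(batches, reference, merged):
    """Return the ordered list of commands: normalize and index each batch, then merge."""
    commands, normalized = [], []
    for vcf in batches:
        out = vcf.replace(".vcf.gz", ".norm.vcf.gz")
        commands.append(norm_cmd(vcf, reference, out))
        commands.append(index_cmd(out))
        normalized.append(out)
    commands.append(merge_cmd(normalized, merged))
    commands.append(index_cmd(merged))
    return commands


# Hypothetical file names for the ten batches.
batches = [f"batch{i}.vcf.gz" for i in range(1, 11)]
commands = plan(batches, "reference.fasta", "merged.vcf.gz")
for c in commands:
    print(shlex.join(c))
```

Note that sites absent from a batch come out of `bcftools merge` as missing genotypes for that batch's samples, which matters when counting rare variants.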