8.6 years ago
Jeremy Leipzig
Is it necessary to run every sequence in 1000 Genomes through GATK HaplotypeCaller as a control for a typical exome experiment?
Suppose some variants of interest that recur in a study are simply absent from the 1000 Genomes (1kg) VCFs. Is it typical to re-run all the sequences from scratch to generate gVCFs, which would carry the coverage information needed to distinguish positions that are reference from those that are no-calls? Could aggregate frequencies from ExAC serve the same purpose?
Do you care about genotypes?
I guess it boils down to this: they don't want to assume that variants absent from the 1000G VCF are all reference calls. For a burden test, I guess they need trusted counts of reference calls?
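To make the concern concrete, here is a minimal sketch of how the control allele frequency shifts depending on whether no-calls are treated as reference. The genotype list is invented for illustration, and `count_alleles` and `alt_frequency` are hypothetical helpers, not part of GATK or any 1000 Genomes tooling.

```python
from collections import Counter


def count_alleles(genotypes):
    """Tally reference, alternate, and missing alleles from VCF-style GT strings."""
    counts = Counter()
    for gt in genotypes:
        # Treat phased (|) and unphased (/) separators the same way.
        for allele in gt.replace("|", "/").split("/"):
            if allele == ".":
                counts["missing"] += 1
            elif allele == "0":
                counts["ref"] += 1
            else:
                counts["alt"] += 1
    return counts


def alt_frequency(counts, missing_as_ref):
    """Alternate allele frequency, optionally counting no-calls as reference."""
    called = counts["ref"] + counts["alt"]
    denom = called + counts["missing"] if missing_as_ref else called
    return counts["alt"] / denom if denom else None


# Made-up control genotypes at one site: half the samples are no-calls.
controls = ["0/0", "0/1", "./.", "./.", "0/0", "./."]
counts = count_alleles(controls)

print(counts)
print("assuming no-calls are reference:", alt_frequency(counts, missing_as_ref=True))
print("using called genotypes only:", alt_frequency(counts, missing_as_ref=False))
```

With half the controls uncalled, the assumed-reference frequency is half the frequency computed from called genotypes only, which is the kind of bias a burden test would inherit from a sites-only VCF.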