8.5 years ago by
Santiago de Compostela, Spain
GATK's UnifiedGenotyper walker can call SNPs and indels at the same time: all you have to do is use the option "-glm BOTH", as described on its manual page. Discovering CNVs is far more complicated, and is not usually done without sequencing the entire genome. (I say "usually" because you can try to infer structural variation by analysing several samples together and studying their sequencing differences; some literature is coming out on this, although I can't tell you whether it works or not because we haven't tried it yet.)
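To make the "-glm BOTH" option concrete, here is a minimal sketch of how the command line could be assembled from Python. It assumes a GATK 2/3-style invocation through the GenomeAnalysisTK jar. The function name and all file names (GenomeAnalysisTK.jar, ref.fasta, sample.bam, calls.vcf) are placeholders of mine, not something from your setup, so adapt them and check the flags against the manual page for your GATK version.

```python
def build_unified_genotyper_cmd(gatk_jar, reference, bam, out_vcf):
    """Return the argument list for one UnifiedGenotyper run.

    Hypothetical helper: it only assembles the arguments and does not
    execute anything.
    """
    return [
        "java", "-jar", gatk_jar,
        "-T", "UnifiedGenotyper",   # the walker to run
        "-R", reference,            # reference FASTA
        "-I", bam,                  # aligned reads
        "-glm", "BOTH",             # genotype likelihoods model: SNPs and indels
        "-o", out_vcf,              # output VCF
    ]


# Placeholder file names, for illustration only.
cmd = build_unified_genotyper_cmd(
    "GenomeAnalysisTK.jar", "ref.fasta", "sample.bam", "calls.vcf"
)
print(" ".join(cmd))
```

The list can be passed to subprocess.run(cmd, check=True) once the paths point at real files.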
Although you can try tuning it for a low number of variants, the VQSR module works well when dealing with several thousand variants per sample, as in exome or whole genome sequencing. The underlying idea is to check how the variants detected in your experiment behave relative to a very well-known source of variation (such as HapMap variants): it builds statistical models of behaviour from the variants that match that reference dataset and applies the results to the rest of the variants. That's why you need lots of variants: you need a significant overlap between your experiment and your reference dataset. VariantFiltration does not work in the same way at all: it does not determine whether your variants are trustworthy or not, it just lets you filter large numbers of variants that match certain criteria. If you can't use VQSR, you can always try looking at Ti/Tv (transition/transversion) ratios or something similar.
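As a rough illustration of the transition/transversion check, here is a small sketch that counts transitions (A<->G, C<->T) and transversions among the biallelic SNPs of a VCF and returns their ratio. The function name and the toy records are my own, and it deliberately ignores indels, multi-allelic sites and the FILTER column, so treat it as a sanity check rather than a replacement for a proper evaluation tool.

```python
# Purine<->purine and pyrimidine<->pyrimidine substitutions are transitions;
# every other single-base substitution is a transversion.
TRANSITIONS = {("A", "G"), ("G", "A"), ("C", "T"), ("T", "C")}


def titv_ratio(vcf_lines):
    """Return (transitions, transversions, ratio) for biallelic SNPs."""
    ti = tv = 0
    for line in vcf_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and VCF headers
        fields = line.rstrip("\n").split("\t")
        ref, alt = fields[3].upper(), fields[4].upper()
        # Keep only single-base REF/ALT pairs (drops indels, multi-allelic
        # sites and missing ALT alleles).
        if len(ref) != 1 or len(alt) != 1:
            continue
        if ref not in "ACGT" or alt not in "ACGT" or ref == alt:
            continue
        if (ref, alt) in TRANSITIONS:
            ti += 1
        else:
            tv += 1
    ratio = ti / tv if tv else float("nan")
    return ti, tv, ratio


# Toy records (made up): three transitions, one transversion, one indel.
example = [
    "##fileformat=VCFv4.1",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "1\t100\t.\tA\tG\t50\tPASS\t.",
    "1\t200\t.\tC\tT\t50\tPASS\t.",
    "1\t300\t.\tG\tA\t50\tPASS\t.",
    "1\t400\t.\tA\tC\t50\tPASS\t.",
    "1\t500\t.\tAT\tA\t50\tPASS\t.",
]
print(titv_ratio(example))  # (3, 1, 3.0)
```

On a real file you would pass an open file handle instead of the list. Figures often quoted for human data are a ratio of about 2 for whole genome and closer to 3 for exome calls, so a value far below that suggests a lot of false positives, but take those numbers as a rule of thumb only.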