Question: Is GATK overestimating the heterozygous calls?
3.6 years ago, biobudhan wrote:

I have 24 genotypes distributed in 4 different populations.

I ran HaplotypeCaller with the option -ERC GVCF to obtain a gVCF file for each genotype, then combined all the genotypes into a single VCF file with GenotypeGVCFs.

Is there a way to tell GATK to label a variant site as "heterozygous" only if the alternate allele is present in >60% of the reads?

Example: at position 82 (highlighted with a red box in the figure), the genotype field for this variant is 0/1, whereas IGV shows that only 3 of the 10 reads carry the alternate allele "A". Which filter should I use in HaplotypeCaller, GenotypeGVCFs, or VariantFiltration so that a variant site is labeled heterozygous only if the alternate allele is present in, for example, 6 out of 10 reads?

Example figure can be found here:

3.6 years ago, andrew.j.skelton73 wrote:

IGV can be misleading in this case. There are two things to consider: optical / PCR duplicates, and assembled reads.

1: IGV may be displaying duplicate reads, which can cause headaches because reads marked as duplicates aren't considered in the analysis.

2: When GATK runs GenotypeGVCFs for joint calling, it works from the HaplotypeCaller results. HaplotypeCaller performs local assembly: it takes your alignment as input and reassembles the reads in windows (active regions), so the reads it genotypes from will likely differ from the initial alignment you passed to it.

You can get HaplotypeCaller to output a BAM file based on its assembly using the --bamOutput argument.
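To see what the caller actually counted, rather than what IGV displays, one option is to read the per-allele depths from the AD field of each sample in the final VCF. The following is only a minimal sketch, not a GATK feature: the function names, the 0.6 default threshold (taken from the question), and the example record are all assumptions made for illustration, and the parsing assumes a plain-text, tab-separated VCF whose FORMAT column contains GT and AD.

```python
from typing import Iterable, Iterator, Optional, Tuple


def alt_allele_fraction(ad: Optional[str]) -> Optional[float]:
    """Fraction of reads supporting any ALT allele, from an AD string such as '7,3'.

    Returns None when AD is missing, contains '.', or sums to zero.
    """
    if not ad:
        return None
    parts = ad.split(",")
    if "." in parts:
        return None
    depths = [int(p) for p in parts]
    total = sum(depths)
    if total == 0:
        return None
    return sum(depths[1:]) / total


def is_het(gt: Optional[str]) -> bool:
    """True for a called diploid genotype with two different alleles, e.g. 0/1 or 1|0."""
    if not gt:
        return False
    alleles = gt.replace("|", "/").split("/")
    return len(alleles) == 2 and "." not in alleles and alleles[0] != alleles[1]


def flag_low_fraction_hets(
    vcf_lines: Iterable[str], min_fraction: float = 0.6
) -> Iterator[Tuple[str, int, int, float]]:
    """Yield (chrom, pos, sample_index, alt_fraction) for heterozygous calls
    whose ALT read fraction is below min_fraction (hypothetical threshold)."""
    for line in vcf_lines:
        if line.startswith("#") or not line.strip():
            continue  # skip header and blank lines
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 10:
            continue  # no FORMAT/sample columns
        chrom, pos, keys = cols[0], int(cols[1]), cols[8].split(":")
        for index, sample in enumerate(cols[9:]):
            fields = dict(zip(keys, sample.split(":")))
            if not is_het(fields.get("GT")):
                continue
            fraction = alt_allele_fraction(fields.get("AD"))
            if fraction is not None and fraction < min_fraction:
                yield chrom, pos, index, fraction


# Made-up record mirroring the question: 0/1 call with 7 REF and 3 ALT reads.
EXAMPLE_RECORD = "chr1\t82\t.\tG\tA\t50\t.\t.\tGT:AD:DP\t0/1:7,3:10"

if __name__ == "__main__":
    for hit in flag_low_fraction_hets([EXAMPLE_RECORD]):
        print(hit)
```

Note that AD reflects the reads the caller considered informative after its own filtering, so it may not match the 3-of-10 count seen in IGV. Also, a true heterozygote in a diploid sample is expected to show roughly half of the reads supporting the alternate allele, so a 60% cutoff would flag many genuine heterozygous calls.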
