I'm using the GATK UnifiedGenotyper to call variants in my exome sequences. My overall pipeline basically follows the Broad's recommended "best practices", along with the recommended details for exome analysis pipelines that have been posted here and over on SeqAnswers. Looking at some of my variants, I'm finding some of the genotype calls a little odd. For instance, some variants are called as homozygous even though there were nearly equal numbers of reads for the reference and alternative alleles, and the DP read depth suggests that most of the total reads were kept (e.g. 158 Ref, 123 Alt, DP of 250). Looking at the Phred-scaled likelihoods, the homozygous alt is of course 0 and the heterozygous is at 15, so going by the likelihoods there is considerable ambiguity in the call. Does anyone have insight into why the genotyper is likely to have favoured the homozygous alt call in cases like this?
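To put a number on that gap, the PL values can be converted back into relative genotype likelihoods. This is only a minimal sketch: it assumes PL = -10 * log10(likelihood), scaled so the best genotype gets 0, and it assumes a flat prior over genotypes. The function name `pl_to_probs` is made up for illustration, and only the het (15) and hom-alt (0) values come from the call described above.

```python
def pl_to_probs(pls):
    """Convert Phred-scaled likelihoods into normalized probabilities.

    Assumes PL = -10 * log10(likelihood) and a flat prior over the
    genotypes passed in (an assumption, not necessarily what the caller uses).
    """
    likelihoods = [10 ** (-pl / 10) for pl in pls]
    total = sum(likelihoods)
    return [lk / total for lk in likelihoods]


# Only the two PLs quoted in the post: het = 15, hom-alt = 0.
het_prob, hom_alt_prob = pl_to_probs([15, 0])

# How many times more likely hom-alt is than het under these PLs.
ratio = hom_alt_prob / het_prob

print(f"het: {het_prob:.3f}, hom-alt: {hom_alt_prob:.3f}, ratio: {ratio:.1f}")
```

Under these assumptions a PL difference of 15 corresponds to a likelihood ratio of roughly 30 to 1 in favour of the homozygous alt genotype.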
Check out the GATK FAQ entry "Why didn't the Unified Genotyper call my SNP? I can see it right there in IGV!".
Without knowing more about your specific case, I'd start with two of the questions that FAQ suggests asking: "What do the mapping qualities look like for the reads with the non-reference bases?" and "What do the base qualities look like for the non-reference bases?".
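As a rough sketch of that check, the snippet below tallies allele support at a single site before and after applying base-quality and mapping-quality cutoffs. Everything in it is hypothetical: the `Obs` record, the `count_supporting` helper, the cutoff values, and the pileup itself are made up for illustration and are not GATK's settings or your data. In practice you would pull the per-read base, base quality, and mapping quality from your BAM at the site in question.

```python
from collections import Counter
from typing import Iterable, NamedTuple


class Obs(NamedTuple):
    """One read's observation at the site (hypothetical record type)."""
    base: str
    base_qual: int
    map_qual: int


def count_supporting(observations: Iterable[Obs],
                     min_base_qual: int,
                     min_map_qual: int) -> Counter:
    """Count bases whose base and mapping qualities meet both cutoffs."""
    counts = Counter()
    for obs in observations:
        if obs.base_qual >= min_base_qual and obs.map_qual >= min_map_qual:
            counts[obs.base] += 1
    return counts


# Made-up pileup: A is the reference base, G the alternative.
pileup = (
    [Obs("A", 35, 60)] * 3    # good reference reads
    + [Obs("G", 30, 60)] * 4  # good alternative reads
    + [Obs("A", 10, 60)] * 2  # reference reads with low base quality
    + [Obs("A", 30, 0)] * 2   # reference reads with mapping quality 0
)

raw = count_supporting(pileup, min_base_qual=0, min_map_qual=0)
# Illustrative cutoffs only; check the settings your own run used.
filtered = count_supporting(pileup, min_base_qual=20, min_map_qual=20)

print("raw:", dict(raw))
print("filtered:", dict(filtered))
```

If the raw counts look balanced but the filtered counts are dominated by one allele, the reads you see in a viewer are not the same evidence the genotyper is weighing.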