Question: GATK's UnifiedGenotyper and Genotype Determination
Dan Gaston (Canada), 5.3 years ago, wrote:

I'm using the GATK UnifiedGenotyper to call variants in my exome sequences. My pipeline basically follows the Broad's recommended "best practices", along with the exome analysis pipeline details that have been posted here and on SeqAnswers. Looking at my variants, some of the genotype calls seem a little odd. For instance, some variants are called homozygous even though there are nearly equal numbers of reads for the reference and alternate alleles, and the DP (read depth) value suggests that most of the total reads were kept (e.g. 158 Ref, 123 Alt, DP of 250). Looking at the Phred-scaled likelihoods, homozygous alt is of course 0 and heterozygous is 15, so going by the likelihoods there is considerable ambiguity in the call. Does anyone have insight into why, in cases like this, the genotyper is likely to have favoured the homozygous alt call?
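For reference, a PL value converts to a relative genotype likelihood as 10^(-PL/10). The following is a minimal sketch of that conversion, assuming a flat prior over the genotypes listed; the function name `pl_to_probs` is illustrative and not part of GATK.

```python
def pl_to_probs(pls):
    """Convert Phred-scaled genotype likelihoods (PL) to normalized probabilities.

    Assumes a flat prior over the listed genotypes (an assumption made for
    illustration; GATK's own priors are not modeled here).
    """
    # PL = -10 * log10(likelihood), scaled so the best genotype has PL 0.
    relative = [10 ** (-pl / 10.0) for pl in pls]
    total = sum(relative)
    return [r / total for r in relative]


# Values from the question: heterozygous PL = 15, homozygous alt PL = 0.
het, hom_alt = pl_to_probs([15, 0])
print(f"het={het:.3f} hom_alt={hom_alt:.3f}")
```

A PL gap of 15 corresponds to a likelihood ratio of 10^1.5, roughly 32 to 1 in favour of the homozygous alt genotype under this flat-prior assumption.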

Tags: exome, gatk, snp, genotyping
matted (Boston, United States), 5.3 years ago, wrote:

Check out their FAQ "Why didn't the Unified Genotyper call my SNP? I can see it right there in IGV!".

Without knowing much about your specific case, I'd consider two of their suggestions in particular: "What do the mapping qualities look like for the reads with the non-reference bases?" and "What do the base qualities look like for the non-reference bases?".


It isn't that the SNP isn't called. The SNP is called; it just seems that the genotype likelihoods aren't splitting the way I would expect. I am guessing it is a combination of base quality and mapping quality; however, neither was below the threshold for filtering. Only about 20 reads in total were filtered, and there isn't a high number of alternate alleles or spanning deletions.

My best guess would be base quality, with a very slight bias towards reference alleles. You can see in the Phred-scaled likelihoods that there is quite a bit of uncertainty, with a nearly equal split between the homozygous and heterozygous calls.
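To show what raw allele counts alone would predict, here is a sketch of a textbook diploid genotype likelihood model with one uniform per-base error rate. It is not GATK's actual implementation, which weighs each base by its own quality; the function names and the error rate of 0.01 are assumptions chosen for illustration.

```python
import math


def genotype_log10_likelihoods(n_ref, n_alt, error=0.01):
    """log10 likelihoods of hom-ref, het and hom-alt from allele counts.

    Simplified biallelic model: every base shares one error rate
    (hypothetical value), and reads are treated as independent.
    """
    # Probability that a single read shows the ALT allele under each genotype.
    p_alt = {"hom_ref": error, "het": 0.5, "hom_alt": 1.0 - error}
    return {
        genotype: n_ref * math.log10(1.0 - p) + n_alt * math.log10(p)
        for genotype, p in p_alt.items()
    }


def to_pl(log10_likelihoods):
    """Phred-scale the likelihoods so the most likely genotype has PL 0."""
    best = max(log10_likelihoods.values())
    return {g: round(-10 * (v - best)) for g, v in log10_likelihoods.items()}


# Allele counts quoted in the question: 158 Ref, 123 Alt.
pls = to_pl(genotype_log10_likelihoods(158, 123))
print(pls)
```

Under this simplified model the heterozygous genotype gets PL 0 and both homozygous genotypes get PLs in the thousands, which suggests that something beyond the raw counts, such as per-base qualities, drove the likelihoods actually reported.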

Reply written 5.3 years ago by Dan Gaston