Unknown genotypes (.) in VCF, but have supporting reads?
2.5 years ago
ben83 ▴ 50

In a VCF created by HaplotypeCaller from the reads of two haploid samples, some entries have a mutation in one sample but not the other. As expected, I see a 1 for one sample and a 0 for the other (only the last two columns are shown):

[image: VCF excerpt, last two columns]

My understanding is that in the above example, the first sample has 3 reads agreeing with the reference and 0 alternate reads, while the second sample has 7 reads agreeing with the reference and 13 with the alternate allele (yes, this looks kind of heterozygote-ish and I said these were haploids, but let's ignore that for now).
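To make that reading concrete, here is a minimal Python sketch that pulls the genotype (GT) and allelic depths (AD) out of one sample column. It assumes a FORMAT string of just `GT:AD` for illustration; real HaplotypeCaller output usually carries more fields, which the lookup by name tolerates. `parse_sample` is a hypothetical helper, not part of any library.

```python
def parse_sample(format_field, sample_field):
    """Return (genotype, allele_depths) for one VCF sample column."""
    keys = format_field.split(":")
    values = sample_field.split(":")
    # Trailing fields may be omitted in a VCF, so zip stops at the shorter list.
    fields = dict(zip(keys, values))
    gt = fields.get("GT", ".")
    ad_raw = fields.get("AD", ".")
    # AD is comma-separated: reference depth first, then one value per ALT allele.
    if ad_raw == ".":
        ad = []
    else:
        ad = [int(x) if x != "." else 0 for x in ad_raw.split(",")]
    return gt, ad


# Depths taken from the description above: 3 ref / 0 alt and 7 ref / 13 alt.
print(parse_sample("GT:AD", "0:3,0"))   # ('0', [3, 0])
print(parse_sample("GT:AD", "1:7,13"))  # ('1', [7, 13])
```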

Now, sometimes there are no reads in one of the samples, and in these cases the genotype appears to be encoded as . instead of 0 or 1. For example:

[image: VCF excerpt]

My understanding is that the 0,0 there means no reads at all, so a . for unknown makes sense. All well and good so far.

But then I see some lines where the genotype is encoded as a . even though there are reads! For example:

[image: VCF excerpt]

What the heck is going on here? There are 150 reads supporting the reference allele, but it doesn't call a genotype? I don't get it.
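One way to see how many such sites there are is to scan the VCF for sample columns where GT is missing but AD is non-zero. The sketch below is only an illustration: `uncalled_with_reads` is a hypothetical helper, the two records are made up (only the 150 reference reads come from the case described above), and it assumes tab-separated records whose FORMAT column contains both GT and AD.

```python
def uncalled_with_reads(vcf_lines):
    """Yield (chrom, pos, sample_index, depths) for no-call genotypes that still have AD reads."""
    for line in vcf_lines:
        if line.startswith("#") or not line.strip():
            continue  # skip header and blank lines
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 10:
            continue  # record has no sample columns
        keys = cols[8].split(":")
        for i, sample in enumerate(cols[9:]):
            fields = dict(zip(keys, sample.split(":")))
            gt = fields.get("GT", ".")
            ad = fields.get("AD", ".")
            # A haploid no-call is "."; a diploid one would be "./." or ".|.".
            no_call = all(a == "." for a in gt.replace("|", "/").split("/"))
            depths = [int(x) for x in ad.split(",") if x.isdigit()]
            if no_call and sum(depths) > 0:
                yield cols[0], int(cols[1]), i, depths


# Hypothetical records: the first mimics the puzzling case (no call, 150 ref reads),
# the second is an ordinary no-call with no reads at all.
records = [
    "chr1\t100\t.\tA\tG\t50\t.\t.\tGT:AD\t.:150,0\t1:0,40",
    "chr1\t200\t.\tC\tT\t50\t.\t.\tGT:AD\t.:0,0\t1:0,12",
]
for hit in uncalled_with_reads(records):
    print(hit)  # ('chr1', 100, 0, [150, 0])
```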

VCF haplotypecaller • 548 views

There are 150 reads supporting the reference allele, but it doesn't call a genotype? I don't get it.

Have a look at this position in IGV: look at the mapping qualities, the SAM flags, the 'clipping' state in that region, etc.
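The same properties can also be checked outside IGV. The following is a rough Python sketch, assuming the reads over the site have been exported as plain SAM text lines (for example with `samtools view`). `summarize_read` is a hypothetical helper, the example read is invented, and the `min_mapq=20` cutoff is an assumption for illustration; check the GATK documentation for the read filters your HaplotypeCaller version actually applies.

```python
import re

# Flag bits as defined by the SAM specification.
FLAG_BITS = {
    0x4: "unmapped",
    0x100: "secondary",
    0x200: "qc_fail",
    0x400: "duplicate",
    0x800: "supplementary",
}


def summarize_read(sam_line, min_mapq=20):
    """Summarize MAPQ, notable flags and soft clipping for one SAM record."""
    cols = sam_line.rstrip("\n").split("\t")
    flag, mapq, cigar = int(cols[1]), int(cols[4]), cols[5]
    flags = [name for bit, name in FLAG_BITS.items() if flag & bit]
    # Sum the lengths of soft-clipped (S) segments in the CIGAR string.
    ops = re.findall(r"(\d+)([MIDNSHP=X])", cigar)
    soft_clipped = sum(int(n) for n, op in ops if op == "S")
    return {
        "name": cols[0],
        "mapq": mapq,
        "low_mapq": mapq < min_mapq,  # assumed threshold, see note above
        "flags": flags,
        "soft_clipped": soft_clipped,
    }


# Invented read: marked duplicate (flag 1024), MAPQ 5, 20 bases soft-clipped.
line = "read1\t1024\tchr1\t100\t5\t20S80M\t*\t0\t0\t*\t*"
print(summarize_read(line))
```

If most reads over a no-call site come back with low MAPQ, duplicate or secondary flags, or heavy soft clipping, that would be consistent with the caller discounting them, though it does not prove that is the cause.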
