How to interpret homozygous-reference genotypes in a VCF file generated by GATK
8.2 years ago

Hi,

In the VCF files I have, the header says output_mode=EMIT_VARIANTS_ONLY. As expected, I see many 0/1 and 1/1 genotypes in the files. However, I also see a few 0/0 genotypes, and most of them have very low quality. How should I interpret these 0/0 genotypes?

8.2 years ago
Dan D 7.3k

A 0/0 genotype indicates that the given sample in the VCF record is homozygous for the reference allele. In other words, that sample carries no variant at this site with respect to the reference sequence used. The low quality indicates that this 0/0 call wasn't made with a high degree of confidence.
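If it helps, here is a minimal Python sketch of how the GT value in a sample column can be read. The function names `classify_genotype` and `sample_fields` are just illustrative, the record is made up, and the code assumes a plain tab-delimited VCF data line with a FORMAT column that contains GT and GQ.

```python
def classify_genotype(gt: str) -> str:
    """Classify a VCF GT string such as '0/0', '0|1' or './.'."""
    # Treat phased ('|') and unphased ('/') separators the same way.
    alleles = gt.replace("|", "/").split("/")
    if any(a == "." for a in alleles):
        return "missing"
    if all(a == "0" for a in alleles):
        return "hom_ref"   # every allele is the reference allele
    if len(set(alleles)) == 1:
        return "hom_alt"   # same non-reference allele on every copy
    return "het"


def sample_fields(record_line: str, sample_index: int = 0) -> dict:
    """Map FORMAT keys to one sample's values for a single VCF data line."""
    cols = record_line.rstrip("\n").split("\t")
    keys = cols[8].split(":")                    # FORMAT column
    values = cols[9 + sample_index].split(":")   # chosen sample column
    return dict(zip(keys, values))


# Made-up record for illustration only.
line = "chr1\t12345\t.\tA\tG\t18.2\tLowQual\tDP=4\tGT:DP:GQ\t0/0:4:3"
fields = sample_fields(line)
print(classify_genotype(fields["GT"]), fields["GQ"])  # prints: hom_ref 3
```

A low GQ next to a `hom_ref` result is the "not called with much confidence" case described above.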

Are there multiple samples in your VCF file?

These are single-sample VCF files: each file corresponds to one subject. Since output_mode=EMIT_VARIANTS_ONLY, I expected to see no 0/0 genotypes at all. Could you tell me why I still see a few 0/0 genotypes in these files?

I see. Now I better understand your question.

Do you see a pattern with the "culprit" descriptor on these 0/0 "variants"?
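One rough way to check for such a pattern is to tally the `culprit` values over the 0/0 records. The sketch below assumes the files carry a `culprit` key in the INFO column (which is only there if variant quality score recalibration was run), that GT is listed in the FORMAT column, and that genotypes are diploid. The function names and the example records are hypothetical.

```python
from collections import Counter


def parse_info(info: str) -> dict:
    """Parse an INFO string like 'DP=4;culprit=QD;DB' into a dict."""
    out = {}
    for item in info.split(";"):
        if item in ("", "."):
            continue
        key, sep, value = item.partition("=")
        out[key] = value if sep else True   # flag entries have no '='
    return out


def culprit_counts(lines, sample_index=0):
    """Count 'culprit' values among records whose sample genotype is 0/0."""
    counts = Counter()
    for line in lines:
        if line.startswith("#"):
            continue                        # skip header lines
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 10:
            continue                        # no sample column present
        keys = cols[8].split(":")
        values = cols[9 + sample_index].split(":")
        gt = dict(zip(keys, values)).get("GT", "")
        if gt.replace("|", "/") != "0/0":
            continue
        # Records without the annotation are counted under 'none'.
        counts[parse_info(cols[7]).get("culprit", "none")] += 1
    return counts


# Made-up records for illustration only.
records = [
    "##fileformat=VCFv4.1",
    "chr1\t100\t.\tA\tG\t12.1\tLowQual\tDP=3;culprit=QD\tGT:GQ\t0/0:2",
    "chr1\t200\t.\tC\tT\t250.0\tPASS\tDP=30;culprit=FS\tGT:GQ\t0/1:99",
    "chr1\t300\t.\tG\tA\t9.8\tLowQual\tDP=2;culprit=QD\tGT:GQ\t0/0:1",
    "chr1\t400\t.\tT\tC\t15.0\tLowQual\tDP=5\tGT:GQ\t0/0:4",
]
print(culprit_counts(records).most_common())
```

To run it on a real file, pass an open file handle instead of `records`, for example `culprit_counts(open(path))` with your own path.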

So does that mean a 0/0 site is not a SNP? Then why is it reported?