Hi, I have a gVCF file produced by GATK. A lot of the sites in the file have "NON_REF" in the ALT allele column. It is a multi-sample, joint-genotyped VCF, and I can see that at some of the sites with NON_REF as the ALT allele, some of the samples have a called genotype of 0/0. I wanted to know what these NON_REF (non-variant) sites actually refer to. Are they sites that are homozygous for the ref allele? And if so, why are they in a VCF file?
Cheers
Thanks Jeremy, I am clearer on this now. One more question: does this mean that every site where reads from my sequenced genome mapped to the reference, but carried the same allele as the reference, is written to the gVCF as NON_REF?
Yes. When <NON_REF> is the only entry in the ALT column, the record represents a span of homozygous-reference sites.
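If it helps to see the distinction in practice, here is a minimal Python sketch that separates records where <NON_REF> is the only ALT allele from records that also list a real alternate allele. It assumes tab-separated gVCF data lines in which a reference block carries its last position in an END= entry of the INFO column. The function names and the two example records are made up for illustration; this is not GATK's own logic.

```python
def parse_info(info_field):
    """Turn a VCF INFO string such as 'END=10050;DP=30' into a dict."""
    info = {}
    if info_field == ".":
        return info
    for entry in info_field.split(";"):
        key, _, value = entry.partition("=")
        # Flag-style entries without '=' are stored as True.
        info[key] = value if value else True
    return info


def classify_gvcf_record(line):
    """Return (kind, chrom, start, end) for one gVCF data line.

    kind is 'ref_block' when <NON_REF> is the only ALT allele,
    otherwise 'variant'. The 'ref_block' / 'variant' labels are
    hypothetical names chosen for this sketch.
    """
    fields = line.rstrip("\n").split("\t")
    chrom, pos, alt, info_field = fields[0], int(fields[1]), fields[4], fields[7]
    alts = alt.split(",")
    if alts == ["<NON_REF>"]:
        # A block spans POS..END; a record with no END covers one site.
        end = int(parse_info(info_field).get("END", pos))
        return ("ref_block", chrom, pos, end)
    return ("variant", chrom, pos, pos)


# Two invented records, for illustration only.
example_block = (
    "chr1\t10001\t.\tA\t<NON_REF>\t.\t.\tEND=10050\t"
    "GT:DP:GQ:MIN_DP:PL\t0/0:30:81:28:0,81,1215"
)
example_variant = (
    "chr1\t10051\t.\tG\tT,<NON_REF>\t412.77\t.\tDP=31\t"
    "GT:AD:DP:GQ:PL\t0/1:15,16,0:31:99:441,0,420,486,468,954"
)

for record in (example_block, example_variant):
    print(classify_gvcf_record(record))
```

Splitting the ALT column on commas matters here, because variant records in a gVCF usually still list <NON_REF> after the real alternate allele, so a plain substring check would mislabel them as reference blocks.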