When a single site has multiple alternative alleles, GT can be given as 0/1, 1/1, 0/2, 2/2, 1/2, etc. What does 1/2 actually mean?

``````
0/1 -> first ALT allele, heterozygous
1/1 -> first ALT allele, homozygous
0/2 -> second ALT allele, heterozygous
2/2 -> second ALT allele, homozygous
1/2 -> ?
``````

For example: chr1 36214421 . CTTTTTTTTTTTT CT,C,CTT,CTTT 5599.73

I have four ALT alleles here, and for each of them I need the homozygous and heterozygous counts across multiple samples, as below:

``````
0/0:12,0,0,0,0:12:0:.:.:0,0,136,0,136,136,0,136,136,136,0,136,136,136,136       0/0:18,0,0,0,0:18:0:.:.:0,0,189,0,189,189,0,189,189,189,0,189,189,189,189    ./.:6,0,0,0,0:6:.:.:.:0,0,0,0,0,0,0,0,0,0,0,0,0,0,0     1/2:0,5,1,0,0:6:8:.:.:252,26,8,128,0,110,154,23,113,137,154,23,113,137,137      0/0:20,0,0,0,0:20:54:.:.:0,54,316,54,316,316,54,316,316,316,54,316,316,316,316       1/3:0,2,0,1,0:3:16:.:.:126,25,16,61,22,52,36,0,30,27,61,22,52,30,52     0/0:12,0,0,0,0:12:0:.:.:0,0,158,0,158,158,0,158,158,158,0,158,158,158,158    0/0:12,0,0,0,0:12:0:.:.:0,0,218,0,218,218,0,218,218,218,0,218,218,218,218       0/0:12,0,0,0,0:12:0:.:.:0,0,186,0,186,186,0,186,186,186,0,186,186,186,186
``````

For the first allele I was counting 1/1 as homozygous and 0/1 as heterozygous; for the second, 2/2 and 0/2; for the third, 3/3 and 0/3.

But I am confused by 1/2, 2/3 and similar genotypes: which category should they go in, and for which allele?

An individual with 1/2 or 1/3 has no reference allele, and in your example 1/2 would correspond to CT on one allele and C on the other. Therefore, relative to the reference, these individuals are neither homozygous nor heterozygous for this variant. It's a more complex case.
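To make the index-to-sequence mapping concrete, here is a minimal sketch in plain Python. The helper name `decode_genotype` is hypothetical, and it assumes the usual VCF convention that index 0 is REF and indices 1, 2, ... follow the order of the comma-separated ALT field:

```python
def decode_genotype(gt, ref, alts):
    """Translate GT allele indices into sequences ('.' becomes None).

    Assumes index 0 is REF and 1..n follow the comma-separated ALT order.
    """
    alleles = [ref] + alts.split(",")
    # Treat phased ('|') and unphased ('/') separators the same way.
    parts = gt.replace("|", "/").split("/")
    return [None if p == "." else alleles[int(p)] for p in parts]


ref = "CTTTTTTTTTTTT"
alts = "CT,C,CTT,CTTT"
print(decode_genotype("1/2", ref, alts))  # ['CT', 'C']
print(decode_genotype("1/3", ref, alts))  # ['CT', 'CTT']
```

So for the site in the question, 1/2 carries the first and second ALT alleles and no copy of the reference sequence.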

Thanks for the reply. Then why can't we consider the variant CT in a 1/2 genotype as heterozygous? Are these scenarios biologically relevant?

Heterozygous is not something that you can say about one variant: it's about the combination of both alleles together. Heterozygosity and homozygosity are indeed biologically relevant, but the reference sequence is just a sequence. In your example, the ref is CTTTTTTTTTTTT. But if the reference individual (actually: multiple individuals) had been CTT at this position, then your variant call would be different as well!

An individual with CT on one allele and C on the other is heterozygous if you consider one of their chromosomes as the reference. So yes, the alleles are different = hetero. But more specifically, in comparison to a reference sequence with another allele, the individual is neither.

OK, I can probably discard these genotypes when taking the count.

Homozygous vs heterozygous has nothing to do with the reference; it is simply whether an individual has the same allele or different alleles on both homologous chromosomes. See Zygosity.

Thus a GT of 1/2 is heterozygous, similarly for a GT of 2/3.
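Under that definition, the per-allele counts the question asks for could be sketched as follows. This is one possible convention, not the only one: a 1/2 sample is counted as a heterozygous carrier of both allele 1 and allele 2. The function names are hypothetical, diploid genotypes are assumed, and the GT list is taken from the nine samples in the question:

```python
from collections import Counter


def zygosity(gt):
    """Classify a GT string by comparing its alleles to each other."""
    parts = gt.replace("|", "/").split("/")
    if "." in parts:
        return "missing"
    return "hom" if len(set(parts)) == 1 else "het"


def per_allele_counts(gts, n_alt):
    """Count hom/het carriers for each ALT index 1..n_alt.

    Convention (an assumption): a sample such as 1/2 is a het carrier
    of allele 1 and also a het carrier of allele 2.
    """
    counts = {i: Counter() for i in range(1, n_alt + 1)}
    for gt in gts:
        parts = gt.replace("|", "/").split("/")
        if "." in parts:
            continue  # skip missing calls
        idx = [int(p) for p in parts]
        for allele in set(idx):
            if allele == 0:
                continue  # reference allele is not counted
            kind = "hom" if all(x == allele for x in idx) else "het"
            counts[allele][kind] += 1
    return counts


gts = ["0/0", "0/0", "./.", "1/2", "0/0", "1/3", "0/0", "0/0", "0/0"]
counts = per_allele_counts(gts, n_alt=4)
for allele, c in counts.items():
    print(allele, "hom:", c["hom"], "het:", c["het"])
```

For these nine samples, allele 1 gets two heterozygous carriers (from 1/2 and 1/3), alleles 2 and 3 get one each, and allele 4 gets none.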

You are definitely right about zygosity, and reading it again my answer might be confusing, too. But in my defence, variants are always considered relative to the reference genome. In the example of OP, you can't simplify it to saying that 67* individuals are heterozygous for that variant since they all have different genotypes. Medically/biologically only one of those alternative alleles might be relevant, and the rest are heterozygous for another alternative allele. I probably didn't offer OP a solution to this question, and your approach is more straightforward, but I want to stress that in larger cohorts those inconvenient positions shouldn't be oversimplified to homozygous and heterozygous carriers.
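One way to avoid that oversimplification is to tally genotype classes separately, so that ALT/ALT heterozygotes are not merged with REF/ALT heterozygotes. The sketch below is illustrative only: the class labels and the function name `genotype_class` are made up for this example, and diploid calls are assumed.

```python
from collections import Counter


def genotype_class(gt):
    """Label a diploid GT, keeping ALT/ALT heterozygotes distinct."""
    parts = gt.replace("|", "/").split("/")
    if "." in parts:
        return "missing"
    a, b = sorted(int(p) for p in parts)  # assumes exactly two alleles
    if a == b:
        return "hom_ref" if a == 0 else f"hom_alt_{a}"
    if a == 0:
        return f"het_ref_{b}"
    return f"het_alt_{a}_{b}"  # e.g. 1/2: two different ALT alleles


gts = ["0/0", "0/0", "./.", "1/2", "0/0", "1/3", "0/0", "0/0", "0/0"]
tally = Counter(genotype_class(g) for g in gts)
print(tally)
```

With the genotypes from the question, this keeps the 1/2 and 1/3 samples in their own classes (`het_alt_1_2`, `het_alt_1_3`) rather than folding them into a single heterozygous count.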

