I have genotype data obtained from an Infinium array. Each genotype is AA, AB, or BB. I would like to find the nucleotide change (e.g. A to C, T to G) at each site for each sample. I have annotation data for each SNP site giving the nucleotides of the A and B alleles, so I'm thinking of converting the genotype data into SNP calls using these simple rules:
- If genotype is AA, don't report anything
- If genotype is AB or BB, report a SNP from A to B
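
To make the rules concrete, here is a minimal Python sketch of what I have in mind. The function name, site IDs, and alleles are made up for illustration, and I'm assuming the annotation can be loaded as a mapping from site ID to an (A allele, B allele) pair. Any genotype other than AA, AB, or BB (a failed call, for example) is rejected rather than guessed at.

```python
from typing import Optional, Tuple


def genotype_to_snp(genotype: str, allele_a: str, allele_b: str) -> Optional[Tuple[str, str]]:
    """Apply the two rules above to one site in one sample.

    Returns None for AA (nothing reported), or (A allele, B allele)
    for AB and BB, meaning "a SNP from A to B".
    """
    if genotype == "AA":
        return None
    if genotype in ("AB", "BB"):
        return (allele_a, allele_b)
    # Anything else is outside the rules as stated, so fail loudly.
    raise ValueError(f"unexpected genotype: {genotype!r}")


# Hypothetical annotation: site ID -> (A allele, B allele)
annotation = {"site1": ("T", "C"), "site2": ("A", "G")}

# Hypothetical genotypes for one sample
sample_genotypes = {"site1": "AB", "site2": "AA"}

# Collect only the sites where a SNP is reported
calls = {}
for site_id, genotype in sample_genotypes.items():
    snp = genotype_to_snp(genotype, *annotation[site_id])
    if snp is not None:
        calls[site_id] = snp
```

Note that AB and BB produce the same output here, so zygosity is lost, and the A allele is implicitly treated as the reference.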
My question is: does this algorithm make sense biologically?