Question: Quality control of Illumina genotyping microarrays
Hi. This question is about Illumina microarrays, and how to interpret results.

As you know, the final report generated by GenomeStudio provides the intensity for both alleles of each SNP; these values are known as X (intensity for allele A) and Y (intensity for allele B). As we understand it, if X >> Y the genotype is AA, if X ~ Y the genotype is AB, and if X << Y the genotype is BB.

Is our overall understanding correct?

So, if we look at this in Cartesian coordinates, the AA cluster lies at around 90 degrees, the AB cluster at around 45 and the BB cluster at around 0. For example, in Figure 1 below, the BB cluster matches this expectation, but the AB cluster doesn't (it lies at around 30 degrees). For a sample in the AB cluster, we see that the intensity on the X channel is 2 times greater than the intensity on the Y channel. In papers (example), some researchers do manual reclustering, after which the clusters may end up in different positions (such as 65 degrees for AA, 40 for AB and 10 for BB), and these samples are still considered OK.
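To make the angles concrete, here is a minimal sketch of the conversion from an (X, Y) pair to an angle, assuming the angle is measured from the X axis as atan(Y/X). The function name `polar_from_xy` is hypothetical and not part of GenomeStudio. As far as we know, GenomeStudio's own Theta and R are derived from normalized intensities, so raw X and Y values may not reproduce its numbers exactly. With this convention X >> Y gives an angle near 0, so which homozygous cluster sits near 0 and which near 90 depends on which channel is plotted on which axis.

```python
import math

def polar_from_xy(x, y):
    """Hypothetical helper: convert one sample's X/Y intensities to polar form.

    Returns (angle_deg, theta_norm, r) where
      angle_deg  = angle from the X axis in degrees (0..90 for x, y >= 0)
      theta_norm = the same angle rescaled to 0..1
      r          = x + y, used here as a simple total-intensity measure (assumption)
    """
    angle_deg = math.degrees(math.atan2(y, x))
    theta_norm = angle_deg / 90.0
    r = x + y
    return angle_deg, theta_norm, r

# X twice as large as Y, as for the AB sample described above
angle, theta, r = polar_from_xy(2.0, 1.0)
print(round(angle, 1), round(theta, 2), r)  # 26.6 0.3 3.0
```

So a 2x difference between the channels corresponds to roughly 27 degrees, which matches the "around 30 degrees" position of the AB cluster described above, while equal intensities give exactly 45 degrees.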


Another example (Figure 2): the samples that were called BB have less than a 2x difference between the intensity values of the X and Y channels; moreover, the sample that lies exactly on the 45-degree line is excluded.


Questions: Should we take into account clusters/samples that are not at angles of ~0/~45/~90 degrees? If yes, how can we distinguish between genotypes (see Figure 2)? And how should the expected angle be determined if we have only a small number of samples (so only one cluster is present)? If not, what criteria would you suggest for excluding samples in general?

Looking forward to your answers, cheers.

