Hi all, I am currently learning a pipeline for quality control of Affymetrix GeneChip data, and one of the probeset QC steps involves excluding any probesets that probe genes not on autosomes, that is, genes on chromosomes X, Y and such. Why is it that we do not want these probesets?
Non-autosomal chromosomes are excluded from many types of analyses (not just this type of microarray QC) because their copy number is expected to differ, both from the other chromosomes within a sample and between samples, depending on sex. For example, a male sample carries one copy of X and one of Y, while a female sample carries two copies of X and no Y. During normalization we generally want to make the signal distributions as comparable as possible across samples, and including these probesets would confound that. Typically these chromosomes are brought back in later in the analysis (though I've seen many people exclude them from variant calling as well, for related reasons).
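
In case it helps to see the filtering step concretely, here is a minimal sketch in plain Python. It is not taken from any particular pipeline: the probeset IDs and chromosome labels are made up, the function names are my own, and it assumes human samples (autosomes 1-22) with a chromosome label already available for each probeset.

```python
# Assumption: human samples, so the autosomes are chromosomes 1-22.
AUTOSOMES = {str(n) for n in range(1, 23)}


def normalize_chrom(label):
    """Strip an optional 'chr' prefix so 'chr7' and '7' compare equal."""
    label = label.strip()
    if label.lower().startswith("chr"):
        label = label[3:]
    return label.upper()


def split_by_autosome(annotation):
    """Split probeset IDs into (autosomal, held_out) lists.

    `annotation` maps probeset ID -> chromosome label.
    Anything that is not on chromosomes 1-22 goes to `held_out`.
    """
    autosomal, held_out = [], []
    for probeset_id, chrom in annotation.items():
        if normalize_chrom(chrom) in AUTOSOMES:
            autosomal.append(probeset_id)
        else:
            held_out.append(probeset_id)
    return autosomal, held_out


# Hypothetical probeset IDs and chromosome labels, for illustration only.
example_annotation = {
    "probeset_A": "chr1",
    "probeset_B": "chrX",
    "probeset_C": "22",
    "probeset_D": "chrY",
    "probeset_E": "chrM",
}

autosomal, held_out = split_by_autosome(example_annotation)
print(autosomal)
print(held_out)
```

The held-out probesets are kept in their own list rather than thrown away, so they can be added back after normalization if the later analysis needs them.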