How do I deal with redundant genes in microarray data?
15 months ago
I have been given normalized microarray data with odd-looking probe IDs in the rows, such as ADXECADA.4210_at. I mapped the probes to gene symbols using the annotation file for HG-U133_Plus_2, and I now see a lot of repeated genes in my final file. What should I do with them? Should I ignore them, or average their expression?

Any suggestions?


15 months ago
Hi, multiple probes for a given gene could indicate that more than one exonic region is being targeted. To be sure, you could retrieve the genomic coordinates of all the probes for such a gene and check them in a genome browser (like UCSC). Sometimes a probe is specific to one transcript isoform, whereas other probes hit multiple transcript isoforms of the same gene.

In my view, the better strategy is not to worry about summarising probes at the start of the analysis. If this is a differential expression study, proceed at the probe level, and once you have the list of differentially expressed probes (which should be a much smaller number), annotate those probes to genes. The NetAffx annotation resource used to give detailed information on which transcript isoform each probe targets and whether it does so uniquely.
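To make the "analyse at probe level, annotate afterwards" idea concrete, here is a minimal Python sketch. Everything in it is hypothetical: the probe IDs, gene symbols, fold changes and adjusted p-values are made up for illustration, and in practice the results would come from your differential expression tool and the mapping from the HG-U133_Plus_2 annotation file. The 0.05 cutoff is also just an assumed threshold.

```python
from collections import defaultdict

# Hypothetical probe-level results: probe ID -> (log2 fold change, adjusted p-value).
# In a real analysis these come from your differential expression tool.
probe_results = {
    "probe_1_at": (1.8, 0.001),
    "probe_2_at": (1.6, 0.004),
    "probe_3_at": (0.1, 0.70),
    "probe_4_at": (-1.2, 0.02),
    "probe_5_at": (0.3, 0.40),
}

# Hypothetical probe -> gene symbol mapping (normally read from the annotation file).
probe_to_gene = {
    "probe_1_at": "GENE_A",
    "probe_2_at": "GENE_A",
    "probe_3_at": "GENE_A",
    "probe_4_at": "GENE_B",
    "probe_5_at": "GENE_C",
}


def significant_probes(results, alpha=0.05):
    """Keep only probes whose adjusted p-value is below alpha (assumed cutoff)."""
    return {probe: stats for probe, stats in results.items() if stats[1] < alpha}


def annotate(probes, mapping):
    """Group the already-filtered probes by gene symbol; unmapped probes are skipped."""
    by_gene = defaultdict(list)
    for probe, (lfc, padj) in probes.items():
        gene = mapping.get(probe)
        if gene is not None:
            by_gene[gene].append((probe, lfc, padj))
    return dict(by_gene)


# Step 1: filter at the probe level. Step 2: annotate only the survivors.
by_gene = annotate(significant_probes(probe_results), probe_to_gene)

for gene, hits in sorted(by_gene.items()):
    print(gene, hits)
```

With this ordering, a gene with several probes only needs a closer look when more than one of its probes survives the filter, or when its probes disagree in direction. Those are the cases worth checking in a genome browser.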


