Question: Resolving Redundant Affymetrix Probe Annotations
4.1 years ago

I am trying to explore the differential gene expression data from a paper that used Affymetrix Mouse Gene 2.1 ST microarrays. The paper doesn't actually explore the expression data (it only has plots showing the number of +/- genes across conditions), and the supplement lists only the probe ID, log ratio and adjusted p-value.

The problem is that when I try to run any pathway/functional analysis tools (Ingenuity Pathway Analysis), most of my probes get dropped because they have multiple annotations. The authors solved this by doing RNA-Seq and building a map between the mouse genome, their contigs and the array probes, but that data isn't published and isn't in SRA, and I'd be willing to bet they're not keen on sharing it.

The first option is to simply take every protein/gene annotated for a given probe and assign each of them that probe's FC and p-value, but that is going to confound any enrichment if those genes all share terms. Alternatively, I could pick the 'canonical' isoform/annotation for each probe, that is, drop any short/alternate gene products, but that will add noise to the analysis by effectively creating false positives and negatives.
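To make the two options concrete, here is a minimal sketch in plain Python. It is an illustration only, under assumptions that are not in the paper's supplement: the probe IDs and gene names are made up, multiple annotations are assumed to be packed into one string separated by `///`, "canonical" is approximated by "first gene listed", and duplicate genes are collapsed by keeping the smallest adjusted p-value, which is just one possible rule.

```python
SEP = "///"  # assumed separator between multiple annotations on one probe


def split_genes(annotation):
    """Return the gene symbols listed in one annotation string."""
    return [g.strip() for g in annotation.split(SEP) if g.strip()]


def expand_all(rows):
    """Option 1: every annotated gene inherits the probe's log ratio and p-value."""
    return [(gene, log_ratio, adj_p, probe)
            for probe, log_ratio, adj_p, annotation in rows
            for gene in split_genes(annotation)]


def pick_one(rows, choose=lambda genes: genes[0]):
    """Option 2: keep a single gene per probe (placeholder rule: first listed)."""
    picked = []
    for probe, log_ratio, adj_p, annotation in rows:
        genes = split_genes(annotation)
        if genes:  # probes with no annotation are skipped
            picked.append((choose(genes), log_ratio, adj_p, probe))
    return picked


def best_per_gene(records):
    """Collapse to one record per gene, keeping the smallest adjusted p-value."""
    best = {}
    for record in records:
        gene, adj_p = record[0], record[2]
        if gene not in best or adj_p < best[gene][2]:
            best[gene] = record
    return best


# Hypothetical rows: (probe_id, log_ratio, adj_p, annotation)
rows = [
    ("probe_1", 1.8, 0.001, "GeneA"),
    ("probe_2", -2.1, 0.010, "GeneB /// GeneC /// GeneD"),
    ("probe_3", 0.9, 0.040, "GeneB"),
]

expanded = best_per_gene(expand_all(rows))
picked = best_per_gene(pick_one(rows))

print("option 1 genes:", sorted(expanded))
print("option 2 genes:", sorted(picked))
```

Running both on the same table and comparing the enrichment results would at least show how sensitive the pathway analysis is to the choice. Here option 1 brings in GeneC and GeneD purely because they share a probe with GeneB, which is the confounding described above.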

I can't imagine this isn't already handled, since it seems to be standard for Affy microarrays, but I'm not finding much beyond probe databases and alternative annotations, which don't really solve the issue.

It's worth at least trying to see if the authors will share that gene->probe mapping file. You never know, you might get lucky.

I am, but cash politics rules everything around me.

