Question: Human Exon 1.0 ST Probesets with multiple gene symbols associated with them
3.8 years ago, lcordeiro (Rio de Janeiro / National Cancer Institute (INCA)) wrote:

Hello everyone,

I'm learning to analyse data from Human Exon arrays and found something curious, which I don't know how to handle. I searched BioStar and couldn't find anything closely related to this issue.

I've done all the processing up to generating a list of "differentially expressed probe sets" (DEPS) with RMA/limma without any problems. I ran RMA at the probeset level and used biomart to get the gene annotation for the DEPS. (I tried the getNetAffx function as well, to no avail; I still didn't know which gene symbol to choose for some probesets.)

When I looked at the annotated results I noticed that more than 600 probesets were annotated to more than one gene symbol (or Entrez ID, or Ensembl ID; the identifier type didn't matter). I know that the converse is absolutely fine (two or more probesets annotating to the same gene), but I wasn't expecting it to be the other way around.
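To make the check concrete, here is a minimal Python sketch of how such probesets could be flagged in an annotation table. It assumes the annotation has already been exported as (probeset ID, gene symbol) pairs; the IDs, symbols, and function names below are placeholders made up for illustration, not real annotation data or part of any library.

```python
from collections import defaultdict

# Hypothetical (probeset_id, gene_symbol) pairs, shaped like the rows an
# annotation query might return. IDs and symbols are placeholders.
annotation_rows = [
    ("PS_0001", "GENE_A"),
    ("PS_0001", "GENE_B"),
    ("PS_0002", "GENE_C"),
    ("PS_0003", "GENE_C"),  # two probesets on one gene: the expected case
    ("PS_0001", "GENE_A"),  # duplicate row, must not be counted twice
]


def symbols_per_probeset(rows):
    """Map each probeset ID to the sorted list of its distinct gene symbols."""
    grouped = defaultdict(set)
    for probeset_id, symbol in rows:
        if symbol:  # ignore rows with an empty annotation
            grouped[probeset_id].add(symbol)
    return {ps: sorted(symbols) for ps, symbols in grouped.items()}


def ambiguous_probesets(rows):
    """Keep only the probesets annotated to more than one gene symbol."""
    per_probeset = symbols_per_probeset(rows)
    return {ps: syms for ps, syms in per_probeset.items() if len(syms) > 1}


ambiguous = ambiguous_probesets(annotation_rows)
print(ambiguous)
```

Grouping into sets first means repeated rows for the same probeset and symbol do not inflate the count, so only genuinely multi-gene probesets are reported.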

I then batch-searched for annotation information directly on the NetAffx website and still got more than one gene symbol for some of the probesets.

My question is: how should I choose the appropriate gene symbol for a given probeset when there are multiple hits? I'm leaning towards picking the first gene symbol returned by the NetAffx query, but this seems too crude...

Perhaps a related question would be: should I forget about analyzing data at the probeset level and simply do it at the transcript cluster (gene) level instead?



3.8 years ago, mastal511 wrote:

I don't know that much about the exon arrays, but with the 3' arrays, some probesets were designed against regions where genes on opposite strands overlap at their 3' or 5' ends, which makes it difficult to know which gene to assign the probeset to. Some probesets were also designed from UniGene clusters, and the annotation of those clusters may have changed over the years, so a cluster may now be associated with more than one gene. You can try aligning some of the probes in question to the genome with BLAT to see where they align, and whether they align in more than one place.
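As a rough illustration of the idea (not a replacement for BLAT), the Python sketch below looks for exact matches of a probe on both strands of a reference sequence. It is a simple string search that assumes no mismatches or gaps, which a real aligner would tolerate; the function names and the toy sequences are made up for the example.

```python
# Translation table for complementing DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def find_probe_hits(probe, reference):
    """List (position, strand) for every exact match of the probe.

    Positions are 0-based offsets on the forward strand of the reference.
    A "-" hit means the reverse complement of the probe matched there.
    """
    probe = probe.upper()
    reference = reference.upper()
    queries = {probe: "+"}
    # Skip the reverse strand for palindromic probes to avoid double counting.
    queries.setdefault(reverse_complement(probe), "-")

    hits = []
    for query, strand in queries.items():
        start = reference.find(query)
        while start != -1:
            hits.append((start, strand))
            start = reference.find(query, start + 1)  # allow overlapping hits
    return sorted(hits)


# Toy sequences, invented for the example.
toy_reference = "TTACGGATTTTCCGTAA"
toy_hits = find_probe_hits("ACGGA", toy_reference)
print(toy_hits)
```

A probe that reports hits at more than one position, or on both strands inside overlapping genes, is the kind of case where a single gene assignment is hard to justify.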
