Hello everybody!
I just started working with data from the Affymetrix GeneChip Mouse Gene 1.0 ST Array and I have two questions about that and hope that somebody can help me.
1. The first is a more general and probably very easy question: which IDs should I use to map Affymetrix probeset_ids or transcript_cluster_ids to genes? I found a lot of questions and very good answers here about several probeset_ids/transcript_cluster_ids being matched to the same gene, etc. I don't have a problem using different R packages (biomaRt, xmapcore, mogene10sttranscriptcluster.db, etc.) to match the Affymetrix IDs to Gene Symbols, Ensembl, Entrez Gene, UniGene IDs, etc. But my question is: which of these IDs should I use to determine that two (or more) transcript_cluster_ids are matched to the same gene? In other words, which ID "type" is the standard for saying that two probesets are assigned to the same gene?
I guess for most genes it shouldn't make a difference which ID type I use, but for some probesets the annotation differs, and which probesets lack an annotation depends on the ID type. I saw that others used gene symbols (which is what I would have used), but I also saw UniGene IDs being used...
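To make the question concrete, here is a minimal sketch of how the choice of ID type can change which transcript clusters count as "the same gene". It is written in Python only to stay free of package dependencies, and every ID in it is made up for illustration; none comes from the real MoGene annotation.

```python
from collections import defaultdict

# Hypothetical annotation rows: (transcript_cluster_id, gene symbol, Entrez Gene ID).
# All IDs are invented for illustration; None marks a missing annotation.
ANNOTATION = [
    ("tc_0001", "GeneA", "1001"),
    ("tc_0002", "GeneA", "1001"),
    ("tc_0003", "GeneB", None),
    ("tc_0004", None, "1002"),
    ("tc_0005", "GeneB", "1002"),
]


def group_by(rows, column):
    """Map each non-missing ID in `column` (1 = symbol, 2 = Entrez) to its transcript clusters."""
    groups = defaultdict(list)
    for row in rows:
        key = row[column]
        if key is not None:
            groups[key].append(row[0])
    return dict(groups)


def shared(groups):
    """Keep only the IDs that more than one transcript cluster maps to."""
    return {key: ids for key, ids in groups.items() if len(ids) > 1}


by_symbol = shared(group_by(ANNOTATION, 1))
by_entrez = shared(group_by(ANNOTATION, 2))

print("grouped by symbol:", by_symbol)
print("grouped by Entrez:", by_entrez)
```

In this toy table, tc_0003 and tc_0005 are grouped together by symbol, while tc_0004 and tc_0005 are grouped together by Entrez ID, which is exactly the kind of disagreement I am asking about.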
2. I am using R and used ReadAffy to get an AffyBatch object from the CEL files that I got from GEO. On GEO the platform is described as [MoGene-1_0-st] Affymetrix Mouse Gene 1.0 ST Array [transcript (gene) version].
From Affymetrix I downloaded two annotation files:
MoGene-1_0-st-v1.na30.mm9.transcript.csv and MoGene-1_0-st-v1.na30.mm9.probeset.csv
I now wanted to match the data in my AffyBatch object to the probeset_ids in the probeset annotation file. When I try probeName(AffyBatch), I get the transcript_cluster_ids for each row in the intensity matrix (these are equal to the probeset_ids in the transcript annotation file) but not the probeset_ids from the probeset annotation file.
Is the information about the probeset_ids from the probeset annotation file not stored in my AffyBatch object because the CEL files are from a "transcript (gene) version", or am I doing something wrong?
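As a possible workaround, the link between the two ID levels could be read from the probeset annotation file itself. The following is only a sketch: it assumes the file has leading comment lines starting with "#" followed by a CSV header containing probeset_id and transcript_cluster_id columns, and it runs on a small made-up sample rather than the real file.

```python
import csv
import io
from collections import defaultdict

# Made-up stand-in for the probeset annotation CSV. The "#" comment line and
# the two column names are assumptions about the file layout.
SAMPLE = '''#%example-header=made up for illustration
"probeset_id","transcript_cluster_id"
"p1","tc_0001"
"p2","tc_0001"
"p3","tc_0002"
'''


def probesets_per_cluster(handle):
    """Return {transcript_cluster_id: [probeset_id, ...]} from an annotation CSV."""
    # Skip the leading comment lines so DictReader sees the header row first.
    data_lines = (line for line in handle if not line.startswith("#"))
    clusters = defaultdict(list)
    for row in csv.DictReader(data_lines):
        clusters[row["transcript_cluster_id"]].append(row["probeset_id"])
    return dict(clusters)


mapping = probesets_per_cluster(io.StringIO(SAMPLE))
print(mapping)
```

For the real file, the io.StringIO(SAMPLE) argument would be replaced by an open file handle, provided the layout assumptions above hold.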
Thank you very much for your help!
Sandra
Hi,
I'm new to microarray analysis and I'm having similar trouble with HuGene-1_0-st-v1 .CEL files, which I want to use for gene analysis and pathway analysis. My file has probeset_ids, and similar transcript_cluster IDs, like these:
and I always see something like this:
Is it because the HuGene arrays have this probeset_id? Or is there some way to get the second probeset_id with suffixes? I'm having a hard time understanding how the IDs (probeset, transcript) relate to genes or exons.
Thanks in advance!!