Question: Mapping probe ids in microarray data to gene ids
Natasha wrote (16 months ago):

I've performed RMA normalization of the raw intensity data (CEL files) from dataset GSE1133. The normalized output has the following format:

                            GSM18584.CEL GSM18585.CEL GSM18586.CEL GSM18587.CEL
AFFX-18SRNAMur/X00686_3_at     10.324639    10.309749     7.978267     7.784038
AFFX-18SRNAMur/X00686_5_at      9.080051     9.401111     5.540294     5.539700

The data is from platform https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GPL1073

I would like to map the probe ids to gene ids. I had a look at the table presented in the above link.

The table header lists the following columns:

Data table header descriptions
ID                 Probe Set Name
Identifier_Source  Identifier_Source
Description        Description
CLONE_ID           clone identifier
Sequence_Type      Sequence Type
SEQUENCE
SPOT_ID            Column added by GEO staff to facilitate sequence tracking in Entrez GEO
GB_ACC             GenBank Accession Number

I also downloaded the complete file. I could find gene names in it, but not mappings such as Entrez gene IDs. I also read that the UCSC Genome Browser (http://genome.ucsc.edu) can be used, but I am not sure which of its tools to use.

Could someone suggest how to proceed?

Manoj (India) wrote (16 months ago):

You can easily do this in R. Here is an example for a human array (HG-U133 Plus 2.0):

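library(hgu133plus2.db)  # annotation package for the HG-U133 Plus 2.0 array
library(annotate)        # provides getSYMBOL()
library(limma)

# Expression table; read.table() uses the first column as row names
# when the header line has one fewer field than the data rows
probeset.list <- read.table("data.txt")

# Look up the gene symbol for each probe set id
gene.symbols <- getSYMBOL(rownames(probeset.list), "hgu133plus2.db")

# Append the symbols as an extra column
results <- cbind(probeset.list, gene.symbols)
print(head(results))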

Hope this helps.


This will not work in this case, because the samples the user is interested in are not from the U133 chip. They are from what seems to be a customised chip called 'GNF1M' (GPL1073).

Natasha, the easiest way is probably to download the 'Annotation SOFT table...' from HERE, read that into R, and then match up this annotation data with your expression matrix. Gene symbols are in column 3 of this annotation file.
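The matching step described above could look roughly like the sketch below. It is only an illustration under stated assumptions: the probe ids, gene symbols and gene ids are made-up placeholders, and the layout of the annotation file (a tab-separated table between `!platform_table_begin` and `!platform_table_end` lines, with `Gene symbol` in column 3 and a `Gene ID` column holding Entrez ids) should be checked against the file actually downloaded for GPL1073.

```r
# Toy expression matrix shaped like the RMA output (probe ids are placeholders).
expr <- matrix(
  c(10.32, 10.31,
     9.08,  9.40),
  nrow = 2, byrow = TRUE,
  dimnames = list(c("probe_A_at", "probe_C_at"),
                  c("GSM18584.CEL", "GSM18585.CEL"))
)

# Toy stand-in for the annotation file. With the real download this would be
# something like: annot_lines <- readLines("GPL1073.annot.gz")
# (hypothetical file name; the column layout below is an assumption).
annot_lines <- c(
  "^Annotation",
  "!platform_table_begin",
  "ID\tGene title\tGene symbol\tGene ID",
  "probe_A_at\tplaceholder gene A\tGENEA\t1001",
  "probe_B_at\tplaceholder gene B\tGENEB\t1002",
  "!platform_table_end"
)

# Keep only the table between the begin/end marker lines.
start <- grep("^!platform_table_begin", annot_lines)
end   <- grep("^!platform_table_end", annot_lines)
annot <- read.delim(text = annot_lines[(start + 1):(end - 1)],
                    quote = "", stringsAsFactors = FALSE,
                    check.names = FALSE)

# Align annotation rows to the expression matrix by probe id.
# Probes missing from the annotation get NA.
idx <- match(rownames(expr), annot$ID)

mapped <- data.frame(
  probe_id    = rownames(expr),
  gene_symbol = annot[["Gene symbol"]][idx],
  entrez_id   = annot[["Gene ID"]][idx],
  expr,
  check.names = FALSE
)

print(mapped)
```

Using `match()` keeps the rows in the same order as the expression matrix, so control probes or probes without an annotation entry stay in the result with `NA` instead of being dropped.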

(reply by Kevin Blighe, 16 months ago)

Glad to know the answer :)

(reply by Manoj, 16 months ago)