Question: How to annotate probes in a GEO series matrix?
13 months ago, pablojosegiraudi wrote:


I am working with a GEO series matrix file (accession GSE48452, platform GPL11532), which corresponds to the HuGene-1_1-st Affymetrix Human Gene 1.1 ST array. I want to annotate the probes, for example with the gene symbol, in order to create a data table like the following:

                   Sample1  Sample2  Sample3  Sample4  Sample5
    #CLASS:CANCER  case     case     case     case
    #CLASS:SEX     F        F        M        M        F        M   F   M
    Gene Symbol
    Gene1          -3.06    -2.25    -1.15    -6.64    0.4
    Gene2          -1.36    -0.67    -0.17    -0.97    -2.0
    Gene3          1.61     -0.27    0.71     -0.62    0.14
    Gene4          0.93     1.29     -0.23    -0.74    -2

How can I map the probes to gene symbols while maintaining the order?

I was using the following R script without success, and I don't know how to proceed:

# Make sure the GEOquery package is installed
library(GEOquery)  # provides getGEO() and Table()
library(Biobase)   # provides exprs()

getGEOdataObjects <- function(x, getGSEobject = FALSE) {
  # Use the getGEO() function to download the GEO data for the id stored in x
  GSEDATA <- getGEO(x, GSEMatrix = TRUE, AnnotGPL = FALSE)
  # Inspect the object by printing a summary of the expression values for the first 2 columns
  print(summary(exprs(GSEDATA[[1]])[, 1:2]))

  # Get the eset object
  eset <- GSEDATA[[1]]
  # Save the objects generated for future use in the current working directory
  save(GSEDATA, eset, file = paste(x, ".RData", sep = ""))

  # Check whether we want to return the list object we downloaded from GEO or
  # just the eset object, depending on the getGSEobject argument
  if (getGSEobject) return(GSEDATA) else return(eset)
}

# Store the dataset ids in a vector GEO_DATASETS in case you want to loop through several GEO ids
GEO_DATASETS <- c("GSE48452")

# Use the function we created to return the eset object
eset <- getGEOdataObjects(GEO_DATASETS[1])
# Inspect the eset object to get the annotation GPL id
annotation(eset)
# Get the annotation for that GPL id (GPL11532 for this series)
gpl <- getGEO("GPL11532", destdir = ".")

# Inspect the table of the gpl annotation object
colnames(Table(gpl))

# Get the gene symbol and entrez ids to be used for annotations
Table(gpl)[1:10, c(1, 2, 6, 12)]

# Get the gene expression data for all the probes with a gene symbol
geneProbes <- which(!is.na(Table(gpl)$Symbol))
probeids <- as.character(Table(gpl)$ID[geneProbes])

probes <- intersect(probeids, rownames(exprs(eset)))

geneMatrix <- exprs(eset)[probes, ]

inds <- which(Table(gpl)$ID %in% probes)
# Check you get the same probes
all(as.character(Table(gpl)$ID[inds]) == probes)

# Create the expression matrix with gene ids
geneMatTable <- cbind(geneMatrix, Table(gpl)[inds, c(1, 2, 6, 12)])

# Save a copy of the expression matrix as a csv file
write.csv(geneMatTable, paste(GEO_DATASETS[1], "_DataMatrix.csv", sep = ""), row.names = TRUE)

Thank you in advance for your help!


Please use the formatting bar (especially the code option) to present your post better. I've done it for you this time.

Thank you!

Comment written 13 months ago by GenoMax
13 months ago, MatthewP wrote:

I recommend using the left_join function from the R package dplyr.
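
To illustrate the suggestion, here is a minimal sketch on toy data rather than the real download. It assumes the platform table returned by Table(gpl) has an ID column and a gene_assignment column whose entries look like "accession // symbol // description"; that layout, the toy probe IDs, the toy gene names, and the helper first_symbol are all assumptions made for illustration, so check colnames(Table(gpl)) before relying on it.

```r
library(dplyr)

# Toy stand-in for exprs(eset): rows are probe IDs, columns are samples.
expr <- matrix(
  c(-3.06, -1.36, 1.61,
    -2.25, -0.67, -0.27),
  nrow = 3,
  dimnames = list(c("7892503", "7892501", "7892502"), c("Sample1", "Sample2"))
)

# Toy stand-in for Table(gpl). The column names and the "//" layout are
# assumptions about the platform table, not confirmed by the thread.
annot <- data.frame(
  ID = c("7892501", "7892502", "7892503", "7892504"),
  gene_assignment = c(
    "NM_000001 // GENE1 // hypothetical gene 1",
    "---",
    "NM_000003 // GENE3 // hypothetical gene 3 /// NR_000003 // GENE3 // hypothetical gene 3",
    "NM_000004 // GENE4 // hypothetical gene 4"
  ),
  stringsAsFactors = FALSE
)

# Hypothetical helper: take the second " // "-separated field of the first
# assignment as the gene symbol, or NA when there is no such field.
first_symbol <- function(x) {
  fields <- strsplit(x, " // ", fixed = TRUE)[[1]]
  if (length(fields) >= 2) fields[2] else NA_character_
}
annot$Symbol <- vapply(annot$gene_assignment, first_symbol, character(1),
                       USE.NAMES = FALSE)

# Move the probe IDs into a column so they can serve as the join key.
expr_df <- data.frame(ID = rownames(expr), expr,
                      stringsAsFactors = FALSE, check.names = FALSE)

# left_join() keeps every row of expr_df, in its original order, and adds
# the matching Symbol (NA when the probe has no symbol).
annotated <- left_join(expr_df, annot[, c("ID", "Symbol")], by = "ID")

# Optionally keep only the probes that received a gene symbol.
annotated_known <- filter(annotated, !is.na(Symbol))

print(annotated)
```

With the real data, expr would be exprs(eset) and annot would be Table(gpl). Both ID columns should be converted with as.character() first, because left_join() refuses to join a character key to a numeric one.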
