Question: Bioconductor getGEO(), how to convert a single eset of GSE matrix files to a list of esets?
2.1 years ago, Davide Chicco wrote:

I've been using the getGEO() function of Bioconductor. I noticed that, instead of downloading the GEO archive every time the script runs, you can pass a filename argument so that getGEO() reads a previously downloaded GEO file from disk.

My original code:

gset <- getGEO("GSE59867", GSEMatrix =TRUE, getGPL=FALSE)

My new code:

GSE59867_filename <- "GSE59867_series_matrix.txt.gz"
gsetFromFile <- getGEO("GSE59867",   filename=GSE59867_filename)

I tried that, but the subsequent steps of my script no longer work. I checked the online documentation for getGEO(), which says:

Note that since a single file is being parsed, the return value is not a list of esets, but a single eset when GSE matrix files are parsed.

Okay, that's the diagnosis; now I need the cure. How can I convert my single eset of GSE matrix files to a list of esets?
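One way to read the documentation note quoted above is that the file-based call returns the eset itself rather than a one-element list, so wrapping it with list() should restore the shape the rest of the script expects. The following is a minimal sketch under that assumption; `as_eset_list` is a hypothetical helper name, not part of GEOquery, and the demonstration uses a stand-in value so that it runs without the package installed.

```r
# Hypothetical helper, not part of GEOquery: make the return value of
# getGEO() look the same whether it came from an accession or a local file.
as_eset_list <- function(x) {
  # An ExpressionSet is an S4 object, so is.list() is FALSE for it and it
  # gets wrapped; something that is already a list is returned untouched.
  if (is.list(x)) x else list(x)
}

# Stand-in demonstration (a character string plays the role of one eset).
single  <- "stand-in for one ExpressionSet"
wrapped <- as_eset_list(single)
length(wrapped)                            # 1
identical(as_eset_list(wrapped), wrapped)  # TRUE

# With the real data; skipped unless GEOquery is installed and the file
# from the question is present in the working directory.
if (requireNamespace("GEOquery", quietly = TRUE) &&
    file.exists("GSE59867_series_matrix.txt.gz")) {
  gset <- as_eset_list(
    GEOquery::getGEO(filename = "GSE59867_series_matrix.txt.gz")
  )
  gset <- gset[[1]]  # list indexing, as in the accession-based workflow
}
```

With this wrapper, code written for the list-returning call can index with [[1]] in both cases.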


EDIT: Sorry if I was unclear: the above piece of code works, but I have a problem later when I use the function getBM(). Here's the complete piece of code:

GSE59867_filename <- "GSE59867_series_matrix.txt.gz"
gset <- getGEO("GSE59867",  GSEMatrix =FALSE,   filename=GSE59867_filename)

if (length(gset) > 1) idx <- grep("GPL570", attr(gset, "names")) else idx <- 1
gset <- gset[[idx]]

mart <- useMart("ENSEMBL_MART_ENSEMBL")
mart <- useDataset("hsapiens_gene_ensembl", mart)
annotLookup <- getBM(mart=mart, attributes=c("affy_hugene_1_0_st_v1", "ensembl_gene_id", "gene_biotype", "external_gene_name"), filter="affy_hugene_1_0_st_v1", values=rownames(exprs(gset))[1:50], uniqueRows=TRUE)

And here's the error I get:

Error in (function (classes, fdef, mtable) :
  unable to find an inherited method for function ‘exprs’ for signature ‘"character"’
Calls: getBM -> rownames -> exprs -> <anonymous>
Execution halted

Any idea on how to solve it? Thanks!

getgeo bioconductor R • 1.3k views
2.1 years ago, Benn wrote:

Your code works for me:

> GSE59867_filename <- "GSE59867_series_matrix.txt.gz"
> gsetFromFile <- getGEO("GSE59867",   filename=GSE59867_filename)

> gsetFromFile
ExpressionSet (storageMode: lockedEnvironment)
assayData: 33297 features, 436 samples 
  element names: exprs 
protocolData: none
  sampleNames: GSM1448335 GSM1448336 ... GSM1620804 (436 total)
  varLabels: title geo_accession ... samples collection:ch1 (34 total)
  varMetadata: labelDescription
  featureNames: 7892501 7892502 ... 8180418 (33297 total)
  fvarLabels: ID GB_LIST ... category (12 total)
  fvarMetadata: Column Description labelDescription
experimentData: use 'experimentData(object)'
Annotation: GPL6244 

> dim(exprs(gsetFromFile))
[1] 33297   436

I get an expression set of 436 samples. I don't understand why you want a list of esets: why do you expect one, and what are your subsequent steps?
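A possible explanation for the exprs() error in the question's EDIT, offered as an assumption rather than a confirmed diagnosis: with filename= the result is already a single ExpressionSet, and as far as I know `[[` on an ExpressionSet extracts a sample-annotation (pData) column instead of a list element. In that case `gset[[idx]]` would yield a character vector, which matches the ‘"character"’ signature in the error. Note also that the output above reports Annotation: GPL6244, so a "GPL570" pattern would not match this series in any case. The sketch below guards the selection step; `pick_eset` is a hypothetical helper, and the demonstration uses stand-in strings in place of real esets.

```r
# Hypothetical helper, not part of GEOquery or Biobase: select one eset
# from whatever getGEO() returned, without indexing into a single eset.
pick_eset <- function(gset, platform) {
  # Single object (e.g. read via filename=): nothing to select.
  if (!is.list(gset)) return(gset)
  # One-element list: take that element.
  if (length(gset) == 1) return(gset[[1]])
  # Several elements: choose by platform ID in the element names.
  idx <- grep(platform, names(gset))
  if (length(idx) == 0) stop("no element name matches platform: ", platform)
  gset[[idx[1]]]
}

# Stand-in demonstration; the names and values are made up for illustration.
esets <- list("GSE0000-GPL570_series_matrix.txt.gz" = "eset A",
              "GSE0000-GPL96_series_matrix.txt.gz"  = "eset B")
pick_eset(esets, "GPL570")       # picks the element whose name contains GPL570
pick_eset("one eset", "GPL570")  # a non-list input is returned unchanged
```

Replacing the `if (length(gset) > 1) ... gset <- gset[[idx]]` lines with a call like this would leave a file-based eset untouched before exprs() is applied.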
