Question: Bioconductor getGEO(), how to convert a single eset of GSE matrix files to a list of esets?
Asked 4 months ago by Davide Chicco:

I've been using the getGEO() function of Bioconductor. I noticed that, instead of downloading the GEO archive every time the script runs, you can set the filename parameter so that getGEO() reads a local copy of the GEO file.

My original code:

gset <- getGEO("GSE59867", GSEMatrix=TRUE, getGPL=FALSE)

My new code:

GSE59867_filename <- "GSE59867_series_matrix.txt.gz"
gsetFromFile <- getGEO("GSE59867", filename=GSE59867_filename)

After this change, the subsequent steps of my script no longer work. I checked the getGEO() documentation, which says:

Note that since a single file is being parsed, the return value is not a list of esets, but a single eset when GSE matrix files are parsed.

Okay, that's the diagnosis; now I need the cure. How can I convert my single eset of GSE matrix files to a list of esets?


EDIT: Sorry if I was unclear: the above piece of code works, but I have a problem later when I use the function getBM(). Here's the complete piece of code:

GSE59867_filename <- "GSE59867_series_matrix.txt.gz"
gset <- getGEO("GSE59867", GSEMatrix=FALSE, filename=GSE59867_filename)

if (length(gset) > 1) idx <- grep("GPL570", attr(gset, "names")) else idx <- 1
gset <- gset[[idx]]

mart <- useMart("ENSEMBL_MART_ENSEMBL")
mart <- useDataset("hsapiens_gene_ensembl", mart)
annotLookup <- getBM(mart=mart,
                     attributes=c("affy_hugene_1_0_st_v1", "ensembl_gene_id", "gene_biotype", "external_gene_name"),
                     filters="affy_hugene_1_0_st_v1",
                     values=rownames(exprs(gset))[1:50],
                     uniqueRows=TRUE)

And here's the error I get:

Error in (function (classes, fdef, mtable) :
  unable to find an inherited method for function ‘exprs’ for signature ‘"character"’
Calls: getBM -> rownames -> exprs -> <anonymous>
Execution halted

Any idea on how to solve it? Thanks!

Answered 4 months ago by b.nota:

Your code works for me:

> GSE59867_filename <- "GSE59867_series_matrix.txt.gz"
> gsetFromFile <- getGEO("GSE59867",   filename=GSE59867_filename)

> gsetFromFile
ExpressionSet (storageMode: lockedEnvironment)
assayData: 33297 features, 436 samples 
  element names: exprs 
protocolData: none
  sampleNames: GSM1448335 GSM1448336 ... GSM1620804 (436 total)
  varLabels: title geo_accession ... samples collection:ch1 (34 total)
  varMetadata: labelDescription
  featureNames: 7892501 7892502 ... 8180418 (33297 total)
  fvarLabels: ID GB_LIST ... category (12 total)
  fvarMetadata: Column Description labelDescription
experimentData: use 'experimentData(object)'
Annotation: GPL6244 

> dim(exprs(gsetFromFile))
[1] 33297   436

I get an expression set with 436 samples. I don't understand why you want a list of esets. Could you explain why you expect one, and what your subsequent steps are?
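
If the later steps really do need a list, one possible approach is to wrap the single eset in a list before indexing it. The sketch below is only an illustration under stated assumptions: as_eset_list() is a hypothetical helper (it is not part of GEOquery), and the explanation of the error rests on my understanding that calling [[ ]] directly on an ExpressionSet returns a phenoData column rather than an eset, which would explain why exprs() ends up receiving a "character" object.

```r
# Hypothetical helper (not part of GEOquery): always return a list,
# whether the input is already a list of esets or a single object.
as_eset_list <- function(x) {
  if (is.list(x)) x else list(x)
}

GSE59867_filename <- "GSE59867_series_matrix.txt.gz"

# The block below only runs when GEOquery is installed and the local
# series matrix file is present in the working directory.
if (requireNamespace("GEOquery", quietly = TRUE) &&
    file.exists(GSE59867_filename)) {
  # Parsing a single file returns one eset, not a list of esets.
  gset <- GEOquery::getGEO(filename = GSE59867_filename, getGPL = FALSE)

  # Normalise to a list so the list-based indexing keeps working.
  gset <- as_eset_list(gset)
  if (length(gset) > 1) idx <- grep("GPL570", names(gset)) else idx <- 1
  gset <- gset[[idx]]

  # gset should now be an ExpressionSet, so exprs() has a method for it.
  print(dim(Biobase::exprs(gset)))
}
```

Alternatively, when reading from a file you could simply skip the gset[[idx]] step, since the object is already a single eset.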
