Question: Bioconductor getGEO(), how to convert a single eset of GSE matrix files to a list of esets?
Davide Chicco (Canada) wrote, 4 months ago:

I've been using the getGEO() function of Bioconductor. I noticed that, instead of downloading the GEO archive every time the script runs, you can set the filename parameter so that getGEO() reads a local copy of the GEO file.

My original code:

gset <- getGEO("GSE59867", GSEMatrix =TRUE, getGPL=FALSE)

My new code:

GSE59867_filename <- "GSE59867_series_matrix.txt.gz"
gsetFromFile <- getGEO("GSE59867",   filename=GSE59867_filename)

I tried that, but the subsequent steps of my script no longer work. I checked the getGEO() documentation, which says:

Note that since a single file is being parsed, the return value is not a list of esets, but a single eset when GSE matrix files are parsed.

Okay, that's the diagnosis; now I need the cure. How can I convert my single eset of GSE matrix files to a list of esets?

Thanks!

EDIT: Sorry if I was unclear: the above piece of code works, but I have a problem later when I use the function getBM(). Here's the complete piece of code:

GSE59867_filename <- "GSE59867_series_matrix.txt.gz"
gset <- getGEO("GSE59867",  GSEMatrix =FALSE,   filename=GSE59867_filename)

if (length(gset) > 1) idx <- grep("GPL570", attr(gset, "names")) else idx <- 1
gset <- gset[[idx]]


mart <- useMart("ENSEMBL_MART_ENSEMBL")
mart <- useDataset("hsapiens_gene_ensembl", mart)
annotLookup <- getBM(
  mart = mart,
  attributes = c("affy_hugene_1_0_st_v1", "ensembl_gene_id", "gene_biotype", "external_gene_name"),
  filter = "affy_hugene_1_0_st_v1",
  values = rownames(exprs(gset))[1:50],
  uniqueRows = TRUE
)

And here's the error I get:

Error in (function (classes, fdef, mtable) :
  unable to find an inherited method for function ‘exprs’ for signature ‘"character"’
Calls: getBM -> rownames -> exprs -> <anonymous>
Execution halted

Any idea on how to solve it? Thanks!

getgeo bioconductor R • 293 views
b.nota (Netherlands) wrote, 4 months ago:

Your code works for me:

> GSE59867_filename <- "GSE59867_series_matrix.txt.gz"
> gsetFromFile <- getGEO("GSE59867",   filename=GSE59867_filename)

> gsetFromFile
ExpressionSet (storageMode: lockedEnvironment)
assayData: 33297 features, 436 samples 
  element names: exprs 
protocolData: none
phenoData
  sampleNames: GSM1448335 GSM1448336 ... GSM1620804 (436 total)
  varLabels: title geo_accession ... samples collection:ch1 (34 total)
  varMetadata: labelDescription
featureData
  featureNames: 7892501 7892502 ... 8180418 (33297 total)
  fvarLabels: ID GB_LIST ... category (12 total)
  fvarMetadata: Column Description labelDescription
experimentData: use 'experimentData(object)'
Annotation: GPL6244 

> dim(exprs(gsetFromFile))
[1] 33297   436

I get an expression set with 436 samples. I don't understand why you want a list of esets. Please explain why you expect one, and what your subsequent steps are.
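
If the later steps index the result with `[[`, as code written for the list return value usually does, that may be where things go wrong: as far as I can tell, `[[` on a single ExpressionSet returns a phenoData column rather than an eset, which would be consistent with the ‘"character"’ signature in the error. One possible workaround is to wrap the single eset in a one-element list before that step. Below is a minimal sketch under that assumption; `as_eset_list()` and the default name `"series_matrix"` are hypothetical and not part of GEOquery.

```r
# Hypothetical helper (not part of GEOquery): lets downstream code that
# expects a list of esets also work when getGEO() returned a single eset.
as_eset_list <- function(x, name = "series_matrix") {
  # A plain list (what getGEO() returns when it downloads a GSE itself)
  # is passed through unchanged.
  if (is.list(x)) {
    return(x)
  }
  # Anything else (e.g. a single ExpressionSet parsed from a local file)
  # is wrapped in a named one-element list.
  stats::setNames(list(x), name)
}

# Usage sketch, commented out because it needs GEOquery and the local file:
# library(GEOquery)
# gset <- as_eset_list(getGEO("GSE59867", filename = "GSE59867_series_matrix.txt.gz"))
# idx  <- if (length(gset) > 1) grep("GPL570", names(gset)) else 1
# gset <- gset[[idx]]  # expected to be an ExpressionSet again
```

Alternatively, when reading from a file, the `gset <- gset[[idx]]` step could simply be skipped, since the object is already a single eset.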
