Question: How to export subset of metadata and expression data from BioConductor GEOquery?
6.5 years ago by
William (4.7k) wrote:

I am planning to use Bioconductor GEOquery to download a couple of microarray datasets from NCBI GEO.

Then I would like to export a subset of the metadata and the expression data to flat files that I can import elsewhere.

What I have so far is:


library(GEOquery)

geo_id <- "GSE45016"

gse <- getGEO(geo_id, GSEMatrix = FALSE)

#show metadata


#show metadata for first sample


#select specific field from metadata of first sample


# Result for sample 1

[1] "tissue: normal prostate (NP) epithelial cells"


# Result for sample 2

[1] "tissue: prostate cancer cells"   "clinical stage: clinical T4N0M1"
[3] "gleason score: GS 9"             "psa level: PSA 5477ng/ml"

As you can see, the number of key-value pairs differs between sample 1 and sample 2. What I would like to have is an array for every key under


and then the value, or null if the key is missing, for every sample in the GEO dataset:

key_tissue: normal prostate (NP) epithelial cells\tprostate cancer cells

key_psa_level: null\tPSA 5477ng/ml

Other metadata fields like "title" luckily only have a single value beneath it.

GSMList(gse)[[1]]@header$title = "Normal prostate"

GSMList(gse)[[2]]@header$title = "High-grade PC1"

I would also like to have these in an array under the key title.


My second question is how to export the expression data that is stored under every sample. I would like to stream through all the probes, get the expression values for each probe in each sample, and write them to another CSV file.

bioconductor R geo • 7.5k views
6.5 years ago by
Sydney, Australia
Neilfws (49k) wrote:

I think that the way you have chosen to read the GSE data into R has created some confusion for you.

Try this instead (note: formatting was lost here so posted as a Gist):
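The Gist itself did not survive on this page. As a rough illustration of the "one row per key, null where missing" layout asked for in the question (a sketch only, not the code from the Gist), the per-sample "key: value" strings can be pivoted with base R. The `chars` list below is a toy stand-in copied from the output in the question; the sample names and the helper `parse_pairs` are made up for the example, and with a real GSE object the strings would presumably come from each sample's header (for example via `Meta()` on the elements of `GSMList(gse)`; the exact field name is an assumption).

```r
# Toy stand-in for the per-sample characteristics shown in the question.
# Sample names are hypothetical.
chars <- list(
  sample1 = c("tissue: normal prostate (NP) epithelial cells"),
  sample2 = c("tissue: prostate cancer cells",
              "clinical stage: clinical T4N0M1",
              "gleason score: GS 9",
              "psa level: PSA 5477ng/ml")
)

# Hypothetical helper: split each "key: value" string at the first ": "
# and return the values as a vector named by key.
parse_pairs <- function(x) {
  keys <- sub(": .*$", "", x)
  vals <- sub("^[^:]*: ", "", x)
  setNames(vals, keys)
}

parsed   <- lapply(chars, parse_pairs)
all_keys <- unique(unlist(lapply(parsed, names)))

# One row per key, one column per sample; keys a sample lacks become NA.
meta <- do.call(cbind, lapply(parsed, function(p) unname(p[all_keys])))
rownames(meta) <- all_keys

# Tab-separated export, writing missing values as "null".
out_file <- tempfile(fileext = ".tsv")
write.table(meta, file = out_file, sep = "\t", quote = FALSE,
            na = "null", col.names = NA)
```

Single-valued fields such as title could be collected in the same way and added as one more row.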

As for exporting the expression data:

exp <- exprs(gse)

returns a matrix where the column names are sample names.
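Since the result is an ordinary matrix with probes in rows and samples in columns, there should be no need to stream through the probes one by one; the whole matrix can be written in a single call. A minimal sketch with a toy matrix standing in for the output of `exprs()` (probe IDs, sample names and values are invented for illustration):

```r
# Toy matrix shaped like the result of exprs():
# probes in rows, samples in columns. All names and values are made up.
expr_mat <- matrix(
  c(7.1, 8.3,
    5.2, 5.9,
    10.4, 9.8),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("probe_1", "probe_2", "probe_3"),
                  c("GSM_a", "GSM_b"))
)

# Write the matrix to CSV; row names (probe IDs) become the first column.
out_file <- tempfile(fileext = ".csv")
write.csv(expr_mat, file = out_file, quote = FALSE)

# Read it back to check the round trip.
check <- read.csv(out_file, row.names = 1)
```

With real data, `expr_mat` would be replaced by the matrix returned by `exprs()`.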


Hi Neilfws, how did you write this reply? It is hosted on GitHub, so how do I prepare a reply like this there? Thanks.

written 4.7 years ago by Shicheng Guo (8.5k)

Nicely done.

written 6.5 years ago by Sean Davis (26k)