Question: How to export subset of metadata and expression data from BioConductor GEOquery?
William (Europe) wrote, 4.4 years ago:

I am planning to use Bioconductor GEOquery to download a couple of micro-array datasets from NCBI GEO.

http://www.bioconductor.org/packages/release/bioc/html/GEOquery.html

Then I would like to export a subset of the metadata and the expression data to flat files that I can import elsewhere.

What I have so far is:

library(GEOquery)
library("R.utils")

geo_id <- "GSE45016"

gse <- getGEO(geo_id,GSEMatrix=FALSE)

#show metadata

Meta(gse)

#show metadata for first sample

GSMList(gse)[[1]]

#select specific field from metadata of first sample

GSMList(gse)[[1]]@header$characteristics_ch1

# Result for sample 1

[1] "tissue: normal prostate (NP) epithelial cells"

GSMList(gse)[[2]]@header$characteristics_ch1

# Result for sample 2

[1] "tissue: prostate cancer cells"   "clinical stage: clinical T4N0M1"
[3] "gleason score: GS 9"             "psa level: PSA 5477ng/ml"

As you can see, the number of key-value pairs differs between samples 1 and 2. What I would like to have is an array for every key under

@header$characteristics_ch1

holding, for every sample in the GEO dataset, either the value or null (if the key is missing for that sample):

key_tissue: normal prostate (NP) epithelial cells\tprostate cancer cells

key_psa_level: null\tPSA 5477ng/ml

Other metadata fields, such as "title", fortunately have only a single value each:

GSMList(gse)[[1]]@header$title = "Normal prostate"

GSMList(gse)[[2]]@header$title = "High-grade PC1"

I would also like these collected into an array under the key title.

 

My second question is how to export the expression data that is stored under every sample. I would like to stream through all the probes, get the expression values for each probe in every sample, and write them to another CSV file.

Neilfws (Sydney, Australia) wrote, 4.4 years ago:
I think that the way you have chosen to read the GSE data into R has created some confusion for you.

Try this instead (note: formatting was lost here so posted as a Gist):
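The Gist is not included in this copy of the thread. As a stand-in for the metadata part of the question, here is a minimal base-R sketch (not Neilfws's code) that turns per-sample "key: value" strings into one table with NA where a key is missing. The function names parse_pairs and characteristics_table, the sample labels, and the output file name are all made up for illustration; the example strings are copied from the question.

```r
# Example input copied from the question: one character vector per sample.
# The list names (sample_1, sample_2) are hypothetical labels.
characteristics <- list(
  sample_1 = c("tissue: normal prostate (NP) epithelial cells"),
  sample_2 = c("tissue: prostate cancer cells",
               "clinical stage: clinical T4N0M1",
               "gleason score: GS 9",
               "psa level: PSA 5477ng/ml")
)

# With the question's GSE object, the same list could presumably be built as:
#   characteristics <- lapply(GSMList(gse),
#                             function(gsm) gsm@header$characteristics_ch1)

# Split each "key: value" string at the first colon only.
parse_pairs <- function(x) {
  keys   <- sub(":\\s*.*$", "", x)      # text before the first colon
  values <- sub("^[^:]*:\\s*", "", x)   # text after the first colon
  setNames(values, keys)
}

# Build a samples-by-keys character matrix; missing keys become NA.
characteristics_table <- function(char_list) {
  parsed   <- lapply(char_list, parse_pairs)
  all_keys <- unique(unlist(lapply(parsed, names)))
  rows     <- lapply(parsed, function(p) unname(p[all_keys]))
  tab      <- do.call(rbind, rows)
  colnames(tab) <- all_keys
  tab
}

meta_table <- characteristics_table(characteristics)

# Write one row per key and one column per sample, tab-separated,
# with "null" for missing values as requested in the question.
meta_file <- file.path(tempdir(), "characteristics.tsv")
write.table(t(meta_table), file = meta_file, sep = "\t",
            quote = FALSE, na = "null", col.names = NA)
```

Single-valued fields such as title can be collected the same way with sapply over the sample list. Strings that contain no colon are not handled specially here; they would end up as both key and value.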

As for exporting the expression data:

exp <- exprs(gse)

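returns a matrix with one row per probe and one column per sample; the column names are the sample names. This assumes gse is an ExpressionSet, as obtained from getGEO with GSEMatrix=TRUE (which returns a list of ExpressionSets), rather than the GSE object created by the GSEMatrix=FALSE call in the question.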
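Once the expression values are in a probes-by-samples matrix, writing them to a flat file needs no per-probe loop. The sketch below uses a small made-up matrix in place of real exprs() output so that it runs without downloading anything; the probe IDs, sample names, values, and file name are placeholders.

```r
# Stand-in for the matrix returned by exprs(): rows are probes, columns are
# samples. All names and values here are invented for illustration.
expr_mat <- matrix(
  c(7.10, 8.25, 5.50,
    6.90, 9.00, 5.75),
  nrow = 3,
  dimnames = list(c("probe_1", "probe_2", "probe_3"),
                  c("GSM_A", "GSM_B"))
)

# With real data this would instead be something like:
#   expr_mat <- exprs(eset)   # eset: an ExpressionSet from getGEO()

# One line per probe, one column per sample; probe IDs go in the first column.
expr_file <- file.path(tempdir(), "expression.csv")
write.csv(expr_mat, file = expr_file, quote = FALSE)

# Read the file back to confirm the layout survives the round trip.
round_trip <- as.matrix(read.csv(expr_file, row.names = 1, check.names = FALSE))
```

write.csv writes the whole matrix in one call, which is usually simpler than streaming probe by probe unless the matrix is too large to hold in memory.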


Comment by Shicheng Guo, 2.7 years ago:

Hi Neilfws, how did you write this reply? It seems to be hosted on GitHub; how do you prepare a post like this there? Thanks.

Comment by Sean Davis, 4.4 years ago:

Nicely done.