Question: How to get data from GEO
10 months ago, wenbinm (USA) wrote:

Hi there,

I am using R package GEOquery to download data from GEO. I use

library(GEOquery)
library(Biobase)
data <- getGEO('GSE2034')
data <- as.data.frame(exprs(data[[1]])) #extracting expression data

Then I have a file named "GSE2034_family.soft.gz" downloaded. So far this works well. But when I later tried reading "GSE2034_family.soft.gz" directly:

library(GEOquery)
library(Biobase)
data <- getGEO(filename = 'GSE2034_family.soft.gz' )
data <- as.data.frame(exprs(data[[1]]))

Then I got

"Error in data[[1]] : this S4 class is not subsettable"

Does anyone know how to fix this?

Thank you!

10 months ago, Kevin Blighe wrote:

Edit (1st September 2018): see a quick distinction of the GEO files, here: A: Parsing values from GSE file

----------------------------

With your first chunk of code, you are obtaining the 'series matrix' data, which, in the vast majority of cases, is already normalized and log (base 2) transformed. getGEO() returns a list here, and each element of that list is an ExpressionSet object, which is the standard way to store microarray data:

data <- getGEO('GSE2034', GSEMatrix=TRUE)

data

$GSE2034_series_matrix.txt.gz
ExpressionSet (storageMode: lockedEnvironment)
assayData: 22283 features, 286 samples 
  element names: exprs 
protocolData: none
phenoData
  sampleNames: GSM36777 GSM36778 ... GSM37062 (286 total)
  varLabels: title geo_accession ... bone relapses (1=yes, 0=no):ch1
    (28 total)
  varMetadata: labelDescription
featureData
  featureNames: 1007_s_at 1053_at ... AFFX-TrpnX-M_at (22283 total)
  fvarLabels: ID GB_ACC ... Gene Ontology Molecular Function (16 total)
  fvarMetadata: Column Description labelDescription
experimentData: use 'experimentData(object)'
Annotation: GPL96

You can proceed to downstream analyses with this data, accessed via exprs(data[[1]]).
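To make the accessors a bit more concrete, here is a minimal sketch of pulling the main pieces out of an ExpressionSet. To keep it runnable without a download, it builds a tiny toy object; the probe names, sample names and values are made up for illustration and are not taken from GSE2034. With the real data you would use data[[1]] in place of eset.

```r
library(Biobase)

# Toy stand-in for data[[1]]: 3 made-up probes x 2 made-up samples.
# Names and values are illustrative only, not from GSE2034.
m <- matrix(c(7.1, 8.4, 5.9, 7.3, 8.1, 6.2), nrow = 3,
            dimnames = list(c("probe_a", "probe_b", "probe_c"),
                            c("sample_1", "sample_2")))
eset <- ExpressionSet(assayData = m)

expr  <- exprs(eset)  # numeric matrix: features in rows, samples in columns
pheno <- pData(eset)  # per-sample annotation (no columns in this toy object)
feat  <- fData(eset)  # per-feature (probe) annotation

dim(expr)
head(as.data.frame(expr))
```

For the real series matrix, pData() would hold the 28 sample variables and fData() the 16 probe annotation columns listed in the printout above.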

------------------------------------------------

Note that, on the GEO page for GSE2034, there's a big blue button at the bottom labelled ANALYZE WITH GEO2R.

Click on that and then go to the R script tab. There, you'll find a ready-made way to read in what is [usually] the normalized data.

Kevin


Thank you for your response! I am sorry, I made a mistake here: getGEO('GSE2034') downloads the 'series matrix' data. I ran into the problem when I tried to read the downloaded series matrix file directly:

data <- getGEO(filename = 'GSE2034_series_matrix.txt.gz' )
data <- as.data.frame(exprs(data[[1]]))

And got the error. I am just looking for a way to use local files instead of downloading every time. data <- getGEO('GSE2034') will download the data again, right?

Reply written 10 months ago by wenbinm

What is the error? Yes, you can just download the series matrix file and then load it with:

gse <- getGEO(filename="GSE2034_series_matrix.txt.gz")

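Then, access the normalised expression values with:

exprs(gse)

Note that getGEO(filename=...) appears to return a single ExpressionSet rather than a list, which would explain the "this S4 class is not subsettable" error. The [[1]] index is only needed when getGEO() was called with an accession and returned a list:

exprs(data[[1]])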
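If you want the same downstream code to work for both ways of calling getGEO(), one option is a small wrapper like the sketch below. first_eset is a hypothetical helper name, not part of GEOquery, and it assumes (as the error message suggests) that the filename call gives back a single S4 object while the accession call gives back a list.

```r
# Hypothetical helper (not part of GEOquery): return the first element
# if getGEO() gave back a list, otherwise return the object unchanged.
first_eset <- function(x) {
  if (is.list(x)) x[[1]] else x
}

# Usage, assuming the file sits in the working directory:
# library(GEOquery)
# library(Biobase)
# gse  <- getGEO(filename = "GSE2034_series_matrix.txt.gz")
# expr <- as.data.frame(exprs(first_eset(gse)))
```

An S4 object such as an ExpressionSet is not a list, so is.list() should be FALSE for it and the object is passed through untouched.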

--------------------------------

If you run getGEO('GSE2034', GSEMatrix=TRUE) twice in the same session, then it will use the data that was already downloaded:

data <- getGEO('GSE2034', GSEMatrix=TRUE)
Found 1 file(s)
GSE2034_series_matrix.txt.gz
trying URL 'https://ftp.ncbi.nlm.nih.gov/geo/series/GSE2nnn/GSE2034/matrix/GSE2034_series_matrix.txt.gz'
Content type 'application/x-gzip' length 14344700 bytes (13.7 MB)
==================================================
downloaded 13.7 MB


data <- getGEO('GSE2034', GSEMatrix=TRUE)
Found 1 file(s)
GSE2034_series_matrix.txt.gz
Using locally cached version: /tmp/RtmppE74xT/GSE2034_series_matrix.txt.gz
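The cached copy above lives in R's temporary directory, which is normally discarded when the session ends. A possible way to reuse the download across sessions is to point getGEO()'s destdir argument at a directory of your own. The sketch below assumes that getGEO() checks destdir for an existing copy in the same way it checked the temporary directory above; fetch_gse and the "geo_cache" directory name are hypothetical and chosen only for illustration.

```r
# Sketch: keep GEO downloads in a persistent directory instead of tempdir().
# 'fetch_gse' and 'geo_cache' are hypothetical names used for illustration.
fetch_gse <- function(accession, cache_dir = "geo_cache") {
  if (!dir.exists(cache_dir)) dir.create(cache_dir, recursive = TRUE)
  # destdir is where getGEO() stores downloaded files
  GEOquery::getGEO(accession, GSEMatrix = TRUE, destdir = cache_dir)
}

# Usage (needs network access on the first call):
# data <- fetch_gse("GSE2034")
# expr <- as.data.frame(Biobase::exprs(data[[1]]))
```

On later calls you would expect to see the "Using locally cached version" message pointing at that directory rather than a new download.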
Reply written 10 months ago by Kevin Blighe