Data extraction from GEO into R
3.8 years ago
silsie645 ▴ 20

I am trying to extract the information of interest from the GEO dataset GSE33113 into R, but R does not seem to recognize my input. I tried:

library(survival)
library(GEOquery)
library(RegParallel)
library(Biobase)
library(survminer)

#loading the series matrix (expression and phenotype data; platform annotation is skipped)
gset<-getGEO('GSE33113',GSEMatrix=TRUE,getGPL=FALSE)

x<-exprs(gset[[1]])

#remove Affymetrix control probes 
x<-x[!grepl('^AFFX', rownames(x)),]

#transform the expression data to Z scores
x<-t(scale(t(x)))

#extracting information of interest
idx<-which(colnames(pData(gset[[1]]))%in%c('Age_At_Diagnosis:ch1','Sex:ch1','Disease status:ch1','Tissue.ch1'))
metadata<-data.frame(pData(gset[[1]])[,idx],row.names=rownames(pData(gset[[1]])))

but the resulting metadata contained only the 'Sex' column, even though I got this response:

#extract information of interest from the phenotype data 
> idx<-which(colnames(pData(gset[[1]]))%in%c('AgeAtDiagnosis:ch1','Death:ch1','Gender:ch1','Grading:ch1','LymphNodesInvaded:ch1','OverallSurvival_months:ch1','TumorFreeSurvival_months:ch1'))
> metadata1<-data.frame(pData(gset[[1]])[,idx],row.names=rownames(pData(gset[[1]])))

Please use the formatting bar (especially the code option) to present your post better. I've done it for you this time.

Thank you!


Ok thanks. I just saw your reply.


Sorry, the idx response was:

#extracting information of interest
> idx<-which(colnames(pData(gset[[1]]))%in%c('Age_At_Diagnosis:ch1','Sex:ch1','Disease status:ch1','Tissue.ch1'))
> metadata<-data.frame(pData(gset[[1]])[,idx],row.names=rownames(pData(gset[[1]])))

For the 'Age_At_Diagnosis' column, I also tried 'Age at diagnosis' (as it appears in the dataset) and 'AgeAtDiagnosis', but I got the same result.

3.8 years ago
dsull ★ 5.8k

Run the following:

print(colnames(pData(gset[[1]])))

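You'll find the correct names of all the columns there, e.g. the correct column name is age at diagnosis:ch1, not Age_At_Diagnosis:ch1. Note that %in% only matches exact, case-sensitive strings, so any name that differs in case, spacing, or punctuation is silently skipped rather than raising an error.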
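To illustrate the point without downloading anything, here is a minimal sketch that uses a toy data frame in place of pData(gset[[1]]). Only 'age at diagnosis:ch1' comes from the answer above; 'tissue:ch1', the sample IDs, and the values are made-up assumptions, so check the real names with the print() call before copying anything.

```r
# Toy stand-in for pData(gset[[1]]). Apart from 'age at diagnosis:ch1' (quoted in
# the answer) and 'Sex:ch1', the names and values are hypothetical.
pd <- data.frame(
  "age at diagnosis:ch1" = c("60", "71"),
  "Sex:ch1"              = c("f", "m"),
  "tissue:ch1"           = c("tumour", "tumour"),
  check.names = FALSE,               # keep spaces and colons in the column names
  row.names   = c("GSM_A", "GSM_B")  # hypothetical sample IDs
)

wanted <- c("Age_At_Diagnosis:ch1", "Sex:ch1", "Tissue.ch1")

# %in% is exact and case-sensitive, so only one of the three names matches
matched   <- wanted[wanted %in% colnames(pd)]
not_found <- setdiff(wanted, colnames(pd))

# Look up the real spelling by keyword, ignoring case
age_col <- grep("age", colnames(pd), ignore.case = TRUE, value = TRUE)

# Subset with the exact names; drop = FALSE keeps a data frame even for one column
metadata <- pd[, c("age at diagnosis:ch1", "Sex:ch1"), drop = FALSE]
```

Checking not_found before subsetting makes a misspelled column name visible immediately instead of producing a metadata table that is quietly missing columns.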
