Question: Error converting from the ENTREZ GENE ID to GENE SYMBOLS
29 days ago, mohammedtoufiq91 wrote:

Hi,

I am trying to convert Entrez Gene IDs to gene symbols using the "org.Hs.eg.db" package and am getting the error message shown below. Please assist me with this; the code I ran follows the error.

Error in .checkKeysAreWellFormed(keys) : 
  keys must be supplied in a character vector with no NAs
  
#Load GEOquery, which provides getGEO(), GSMList(), GPLList(), Table() and Meta()
library(GEOquery)

#Get the GSE SOFT file
GSE59491 <- getGEO("GSE59491", GSEMatrix=FALSE)

#sample names
names(GSMList(GSE59491))

#platforms used in this GSE
names(GPLList(GSE59491))
GSM.platforms <- lapply(GSMList(GSE59491), function(x) {Meta(x)$platform})
data.frame(GSM.platforms)

#example of a GSM expression vector
Table(GSMList(GSE59491)[[1]])[1:100,]

#example of gene annotation data from the GPL of GSM 1
Probe.anotaion.table <- Table(GPLList(GSE59491)[[1]])
#probesets extracted from the GPL of GSM 1
probesets <- as.character(Probe.anotaion.table$ID)
rownames(Probe.anotaion.table) <- make.names(Probe.anotaion.table$ID, unique=TRUE)
#strip only the leading "X" that make.names() prepends to IDs starting with a digit
rownames(Probe.anotaion.table) <- gsub(rownames(Probe.anotaion.table), pattern = "^X", replacement = "")

#creating the expression matrix ordered by the GPL order of probes
data.matrix <- do.call('cbind',lapply(GSMList(GSE59491),function(x) {
  tab <- Table(x)
  mymatch <- match(probesets,tab$ID_REF)
  return(tab$VALUE[mymatch])
}))
data.matrix <- apply(data.matrix,2,function(x) {as.numeric(as.character(x))})


rownames(data.matrix) <- probesets
data.matrix <- data.matrix[complete.cases(data.matrix), ]
data.matrix[1:5,]
##If the values are log2 transformed: convert back to linear scale and floor at 10##
data.matrix <- 2^data.matrix
data.matrix[data.matrix < 10] <- 10


###match probe and gene names
ProbeID.GSE59491 <- Probe.anotaion.table[which(rownames(Probe.anotaion.table)%in%rownames(data.matrix)),]
rownames(ProbeID.GSE59491)==rownames(data.matrix)

dim(data.matrix[complete.cases(ProbeID.GSE59491), ])


library(org.Hs.eg.db)
library(annotate)
ProbeID.GSE59491$Symbol = getSYMBOL(ProbeID.GSE59491$ENTREZ_GENE_ID, data='org.Hs.eg.db')

Error in .checkKeysAreWellFormed(keys) : keys must be supplied in a character vector with no NAs


So.... do you have any NAs in ProbeID.GSE59491$ENTREZ_GENE_ID? What's the variable type?

written 29 days ago by WouterDeCoster

@WouterDeCoster

Hi,

There are no NAs and the class is also shown below:

> sum(is.na(ProbeID.GSE59491))
[1] 0
> sum(is.na(ProbeID.GSE59491$ENTREZ_GENE_ID))
[1] 0


> class(ProbeID.GSE59491)
[1] "data.frame"
> class(ProbeID.GSE59491$ENTREZ_GENE_ID)
[1] "numeric"
> str(ProbeID.GSE59491)
'data.frame':   25088 obs. of  2 variables:
 $ ID            : chr  "100009613_at" "100009676_at" "10000_at" "10001_at" ...
 $ ENTREZ_GENE_ID: num  1e+08 1e+08 1e+04 1e+04 1e+04 ...
 - attr(*, "spec")=
  .. cols(
  ..   ID = col_character(),
  ..   ENTREZ_GENE_ID = col_double()
  .. )
written 29 days ago by mohammedtoufiq91
29 days ago, WouterDeCoster (Belgium) wrote:

Please use ADD COMMENT or ADD REPLY to respond to earlier posts, so that the thread stays logically structured and easy to follow. I have now moved your reaction, but as you can see the result is not optimal. Adding an answer should only be used for providing a solution to the question asked.

So your ProbeID.GSE59491$ENTREZ_GENE_ID is "numeric", while the error message states that keys must be supplied in a character vector.
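
To make the two conditions in that message concrete, here is a small base R sketch. The helper `keys_are_well_formed` is hypothetical and written only for illustration; it is not the internal AnnotationDbi check, it merely tests the same two things the message names (character type, no NAs). The example keys are made up.

```r
# Hypothetical helper mirroring the rule stated in the error message.
# NOT the AnnotationDbi internal; it only illustrates the two conditions.
keys_are_well_formed <- function(keys) {
  is.character(keys) && !anyNA(keys)
}

numeric_keys <- c(10000, 10001)      # wrong type: numeric, as in the question
na_keys      <- c("10000", NA)       # right type, but contains an NA
good_keys    <- c("10000", "10001")  # character vector without NAs

res_numeric <- keys_are_well_formed(numeric_keys)
res_na      <- keys_are_well_formed(na_keys)
res_good    <- keys_are_well_formed(good_keys)

print(c(numeric = res_numeric, with_na = res_na, good = res_good))
```

In the question, the NA check passes but the type check does not, which is why `sum(is.na(...))` being 0 was not enough.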


@WouterDeCoster

Hi,

Sure. Converting the column from numeric to character resolved the issue. Thank you.
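
For readers landing here with the same error, a minimal sketch of one way to do that conversion, assuming every ID is a whole number small enough to fit in an R integer. The vector `entrez_num` is made-up example data. One caveat worth knowing: calling `as.character()` directly on doubles can render round values such as 100000 in scientific notation, which would not match any key, so this sketch goes through `as.integer()` first.

```r
# Hypothetical numeric Entrez Gene IDs, as produced when a reader guesses "double".
entrez_num <- c(10000, 10001, 100000, 100009613)

# Direct conversion; round values may come out in scientific notation.
naive <- as.character(entrez_num)

# Converting via integer keeps the plain digit string.
# Assumes whole-number IDs below .Machine$integer.max.
entrez_chr <- as.character(as.integer(entrez_num))

print(naive)
print(entrez_chr)

# With annotate and org.Hs.eg.db loaded, the call from the question would then
# become (not run here):
# ProbeID.GSE59491$Symbol <- getSYMBOL(
#   as.character(as.integer(ProbeID.GSE59491$ENTREZ_GENE_ID)),
#   data = "org.Hs.eg.db")
```

Since the `str()` output above shows the column was read as `col_double()`, reading the ID column as character in the first place would avoid the conversion altogether.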

written 29 days ago by mohammedtoufiq91