Error converting Entrez Gene IDs to gene symbols
5.0 years ago

Hi,

I am trying to convert Entrez Gene IDs to gene symbols using the "org.Hs.eg.db" library, and I get the error message shown below. The code that produces it follows the error. Please assist me with this.

Error in .checkKeysAreWellFormed(keys) : 
  keys must be supplied in a character vector with no NAs
  
library(GEOquery)

# Get the GSE SOFT file (GSEMatrix=FALSE returns a GSE object rather than a series matrix)
GSE59491 <- getGEO("GSE59491", GSEMatrix=FALSE)

#sample names
names(GSMList(GSE59491))

# Platforms used in this GSE
names(GPLList(GSE59491))
GSM.platforms <- lapply(GSMList(GSE59491), function(x) {Meta(x)$platform})
data.frame(GSM.platforms)

# Example of a GSM expression vector
Table(GSMList(GSE59491)[[1]])[1:100,]

# Example of gene annotation data from the GPL of GSM 1
Probe.anotaion.table <- Table(GPLList(GSE59491)[[1]])
# Probe set IDs extracted from the GPL of GSM 1
probesets <- as.character(Probe.anotaion.table$ID)
rownames(Probe.anotaion.table) <- make.names(Probe.anotaion.table$ID, unique=TRUE)
# make.names() prefixes names that start with a digit with "X"; strip only that leading "X"
rownames(Probe.anotaion.table) <- sub("^X", "", rownames(Probe.anotaion.table))

#creating the expression matrix ordered by the GPL order of probes
data.matrix <- do.call('cbind',lapply(GSMList(GSE59491),function(x) {
  tab <- Table(x)
  mymatch <- match(probesets,tab$ID_REF)
  return(tab$VALUE[mymatch])
}))
data.matrix <- apply(data.matrix,2,function(x) {as.numeric(as.character(x))})


rownames(data.matrix) <- probesets
data.matrix <- data.matrix[complete.cases(data.matrix), ]
data.matrix[1:5,]
# Only if the values are log2-transformed: back-transform, then floor at 10
data.matrix <- 2^data.matrix
data.matrix[data.matrix < 10] <- 10


# Match probe and gene names
ProbeID.GSE59491 <- Probe.anotaion.table[which(rownames(Probe.anotaion.table) %in% rownames(data.matrix)), ]
all(rownames(ProbeID.GSE59491) == rownames(data.matrix))

dim(data.matrix[complete.cases(ProbeID.GSE59491), ])


library(org.Hs.eg.db)
library(annotate)
ProbeID.GSE59491$Symbol = getSYMBOL(ProbeID.GSE59491$ENTREZ_GENE_ID, data='org.Hs.eg.db')

Error in .checkKeysAreWellFormed(keys) : keys must be supplied in a character vector with no NAs

R org.Hs.eg.db affymetrix annotations • 2.3k views

So.... do you have any NAs in ProbeID.GSE59491$ENTREZ_GENE_ID? What's the variable type?


@WouterDeCoster

Hi,

There are no NAs and the class is also shown below:

> sum(is.na(ProbeID.GSE59491))
[1] 0
> sum(is.na(ProbeID.GSE59491$ENTREZ_GENE_ID))
[1] 0


> class(ProbeID.GSE59491)
[1] "data.frame"
> class(ProbeID.GSE59491$ENTREZ_GENE_ID)
[1] "numeric"
> str(ProbeID.GSE59491)
'data.frame':   25088 obs. of  2 variables:
 $ ID            : chr  "100009613_at" "100009676_at" "10000_at" "10001_at" ...
 $ ENTREZ_GENE_ID: num  1e+08 1e+08 1e+04 1e+04 1e+04 ...
 - attr(*, "spec")=
  .. cols(
  ..   ID = col_character(),
  ..   ENTREZ_GENE_ID = col_double()
  .. )

Please use ADD COMMENT or ADD REPLY to respond to earlier posts, so that the thread stays logically structured and easy to follow. I have now moved your reply, but as you can see the result is not optimal. Adding an answer should be reserved for providing a solution to the question asked.

So your ProbeID.GSE59491$ENTREZ_GENE_ID is "numeric" while the error message states keys must be supplied in a character vector.
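
The two conditions named in the error message (character type, no NAs) can be checked directly before the lookup. A minimal sketch in base R, where `check_keys` is a hypothetical helper (not part of annotate or AnnotationDbi) and the vector is a toy stand-in for the ENTREZ_GENE_ID column:

```r
# Hypothetical helper: reports whether a vector meets the two requirements
# named in the error message (character type, no NAs).
check_keys <- function(keys) {
  c(is_character = is.character(keys), no_NAs = !anyNA(keys))
}

# Toy stand-in for ProbeID.GSE59491$ENTREZ_GENE_ID (illustrative values)
entrez_numeric <- c(100009613, 100009676, 10000, 10001)

check_keys(entrez_numeric)                # is_character is FALSE for a numeric column
check_keys(as.character(entrez_numeric))  # both conditions hold after coercion
```

A column can pass the NA check and still fail the type check, which is the situation described here.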


@WouterDeCoster

Hi,

Sure. Converting the ENTREZ_GENE_ID column from numeric to character resolved the issue. Thank you.
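
For anyone landing here later, a sketch of what that conversion might look like. The vector below is a toy stand-in for ProbeID.GSE59491$ENTREZ_GENE_ID, and going through `as.integer()` is my own precaution rather than something stated in the thread: coercing a double straight to character can yield scientific notation for some round values, which would not match any key. The lookup itself is guarded so the snippet still runs where annotate and org.Hs.eg.db are not installed.

```r
# Toy stand-in for ProbeID.GSE59491$ENTREZ_GENE_ID (illustrative values)
entrez_numeric <- c(100009613, 100009676, 10000, 100000)

# Direct coercion may render some doubles in scientific notation,
# which would not match any Entrez Gene ID key.
as.character(entrez_numeric)

# Converting to integer first keeps plain digits
# (assumes every ID fits in a 32-bit integer).
entrez_keys <- as.character(as.integer(entrez_numeric))
stopifnot(is.character(entrez_keys), !anyNA(entrez_keys))

# Run the lookup only if the annotation packages are available.
if (requireNamespace("annotate", quietly = TRUE) &&
    requireNamespace("org.Hs.eg.db", quietly = TRUE)) {
  symbols <- annotate::getSYMBOL(entrez_keys, data = "org.Hs.eg.db")
  print(head(symbols))
}
```

On the real data, the same idea would be applied to the column before assigning ProbeID.GSE59491$Symbol.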
