Question: Error converting from the ENTREZ GENE ID to GENE SYMBOLS
10 months ago by
mohammedtoufiq91 wrote:

Hi,

I am trying to convert ENTREZ GENE IDs to GENE SYMBOLS using the "org.Hs.eg.db" library, and I get the error message shown below. Please assist me with this. The code I used follows the error.

Error in .checkKeysAreWellFormed(keys) : 
  keys must be supplied in a character vector with no NAs
  
library(GEOquery)

#Get the GSE SOFT file (GSEMatrix=FALSE returns a GSE object)
GSE59491 <- getGEO("GSE59491", GSEMatrix=FALSE)

#sample names
names(GSMList(GSE59491))

#platforms used in this GSE
names(GPLList(GSE59491 ))
GSM.platforms <- lapply(GSMList(GSE59491 ),function(x) {Meta(x)$platform}) 
data.frame(GSM.platforms)

#example of a GSM expression vector
Table(GSMList(GSE59491)[[1]])[1:100,]

#example of gene annotation data from the GPL of GSM 1
Probe.anotaion.table <- Table(GPLList(GSE59491)[[1]])
#Probesets extracted from the GPL of GSM 1
probesets <- as.character(Probe.anotaion.table$ID)
rownames(Probe.anotaion.table) <- make.names(Probe.anotaion.table$ID, unique=TRUE)
#strip the leading "X" that make.names() prepends to IDs starting with a digit
rownames(Probe.anotaion.table) <- gsub(pattern = "^X", replacement = "", x = rownames(Probe.anotaion.table))

#creating the expression matrix ordered by the GPL order of probes
data.matrix <- do.call('cbind',lapply(GSMList(GSE59491),function(x) {
  tab <- Table(x)
  mymatch <- match(probesets,tab$ID_REF)
  return(tab$VALUE[mymatch])
}))
data.matrix <- apply(data.matrix,2,function(x) {as.numeric(as.character(x))})


rownames(data.matrix) <- probesets
data.matrix <- data.matrix[complete.cases(data.matrix), ]
data.matrix[1:5,]
##If log transformed##
data.matrix= 2^data.matrix
data.matrix[data.matrix<10]=10


###match probe and gene names
ProbeID.GSE59491 <- Probe.anotaion.table[which(rownames(Probe.anotaion.table)%in%rownames(data.matrix)),]
rownames(ProbeID.GSE59491)==rownames(data.matrix)

dim(data.matrix[complete.cases(ProbeID.GSE59491), ])


library(org.Hs.eg.db)
library(annotate)
ProbeID.GSE59491$Symbol = getSYMBOL(ProbeID.GSE59491$ENTREZ_GENE_ID, data='org.Hs.eg.db')

Error in .checkKeysAreWellFormed(keys) : keys must be supplied in a character vector with no NAs

modified 10 months ago • written 10 months ago by mohammedtoufiq91

So... do you have any NAs in ProbeID.GSE59491$ENTREZ_GENE_ID? What's the variable type?

written 10 months ago by WouterDeCoster

@WouterDeCoster

Hi,

There are no NAs, and the class is shown below:

> sum(is.na(ProbeID.GSE59491))
[1] 0
> sum(is.na(ProbeID.GSE59491$ENTREZ_GENE_ID))
[1] 0


> class(ProbeID.GSE59491)
[1] "data.frame"
> class(ProbeID.GSE59491$ENTREZ_GENE_ID)
[1] "numeric"
> str(ProbeID.GSE59491)
'data.frame':   25088 obs. of  2 variables:
 $ ID            : chr  "100009613_at" "100009676_at" "10000_at" "10001_at" ...
 $ ENTREZ_GENE_ID: num  1e+08 1e+08 1e+04 1e+04 1e+04 ...
 - attr(*, "spec")=
  .. cols(
  ..   ID = col_character(),
  ..   ENTREZ_GENE_ID = col_double()
  .. )
written 10 months ago by mohammedtoufiq91
10 months ago by
Belgium
WouterDeCoster wrote:

Please use ADD COMMENT or ADD REPLY to respond to earlier posts, so that the thread stays logically structured and easy to follow. I have moved your reply, but as you can see the result is not optimal. Adding an answer should be reserved for providing a solution to the question asked.

So your ProbeID.GSE59491$ENTREZ_GENE_ID is "numeric", while the error message states that keys must be supplied in a character vector.
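
The mismatch can be checked directly in base R. The snippet below is only a sketch: `entrez_ids` is a hypothetical stand-in for ProbeID.GSE59491$ENTREZ_GENE_ID, and its values are illustrative.

```r
# Hypothetical stand-in for ProbeID.GSE59491$ENTREZ_GENE_ID (illustrative values)
entrez_ids <- c(10000, 10001, 100009613)

class(entrez_ids)         # "numeric": not the character vector the key check asks for
is.character(entrez_ids)  # FALSE
anyNA(entrez_ids)         # FALSE: no NAs, so the vector type is the remaining suspect
```

The error message bundles two conditions (character vector, no NAs), so it helps to test each one separately.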

written 10 months ago by WouterDeCoster

@WouterDeCoster

Hi,

Sure. Converting the column from numeric to character resolved the issue. Thank you.
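
For readers who land here with the same error, the conversion might look like the sketch below. `entrez_ids` and `keys` are hypothetical names with illustrative values, not the poster's actual code. One assumption worth flagging: the sketch goes through as.integer() first, because as.character() on a double can render round values in scientific notation (for example "1e+05"), which would not match any key. The lookup is skipped when the Bioconductor packages annotate and org.Hs.eg.db are not installed.

```r
# Hypothetical numeric Entrez IDs (illustrative values)
entrez_ids <- c(10000, 10001, 100000, 100009613)

# Convert via integer so that every ID keeps its plain digit form
keys <- as.character(as.integer(entrez_ids))

# Same lookup as in the question, now with character keys
if (requireNamespace("annotate", quietly = TRUE) &&
    requireNamespace("org.Hs.eg.db", quietly = TRUE)) {
  symbols <- annotate::getSYMBOL(keys, data = "org.Hs.eg.db")
  print(symbols)  # IDs without a symbol in the database come back as NA
}
```

Applied to the data frame in the question, the same idea would be ProbeID.GSE59491$ENTREZ_GENE_ID <- as.character(as.integer(ProbeID.GSE59491$ENTREZ_GENE_ID)) before calling getSYMBOL().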

written 10 months ago by mohammedtoufiq91