Question: Error in converting the probesets
A wrote:

Hello

I have an expression set from rat, GSE43700.

I am trying to annotate the probes, but I get this error:

> columns(hgu133plus2.db) # Features retrievable by AnnotationDbi::select
 [1] "ACCNUM"       "ALIAS"        "ENSEMBL"      "ENSEMBLPROT"  "ENSEMBLTRANS"
 [6] "ENTREZID"     "ENZYME"       "EVIDENCE"     "EVIDENCEALL"  "GENENAME"    
[11] "GO"           "GOALL"        "IPI"          "MAP"          "OMIM"        
[16] "ONTOLOGY"     "ONTOLOGYALL"  "PATH"         "PFAM"         "PMID"        
[21] "PROBEID"      "PROSITE"      "REFSEQ"       "SYMBOL"       "UCSCKG"      
[26] "UNIGENE"      "UNIPROT"     
> anno_filt_eset2 <- AnnotationDbi::select(hgu133plus2.db, keys = (featureNames(filt_eset2)), columns = c("SYMBOL", "GENENAME", "ENTREZID"), keytype = "PROBEID")
Error in .testForValidKeys(x, keys, keytype, fks) : 
  None of the keys entered are valid keys for 'PROBEID'. Please use the keys method to see a listing of valid arguments.
>

Any help please?

written 4 weeks ago by A

See the answer of James on Bioconductor:

written 4 weeks ago by Kevin Blighe

Kevin Blighe wrote:

Hello again. Can you let me know what is the output of:

featureNames(filt_eset2)

?

written 4 weeks ago by Kevin Blighe

Thank you

> featureNames(filt_eset2)
   [1] "1399067_at"   "1388081_at"   "1375569_at"   "1389201_at"   "1371559_at"
written 4 weeks ago by A

Thank you. You said that it is rat; however, the GEO record indicates that it is human. Can you show all of your initial data processing steps?

written 4 weeks ago by Kevin Blighe

Thank you

Yes, this is rat, and as this is public data, here is my R object:

https://www.dropbox.com/s/k338ooac2dpifg6/GSEA_Broad.R?dl=0

And the full R script:

https://www.dropbox.com/s/qxd8pan0jpuoq1m/Preranked_fGSEA%20%281%29.R?dl=0

And these are the first lines of what I run before the error:

gse <- getGEO("GSE43700", GSEMatrix = T, AnnotGPL = T)

show(gse)

head(exprs(gse[[1]]))[,1:5]

pdata <- as.data.frame(pData(gse[[1]]), stringsAsFactors = F)

all(colnames(exprs(gse[[1]]))==pdata$geo_accession)

plotMDS(exprs(gse[[1]]), labels = pdata$title)
plotMDS(exprs(gse[[1]]), labels = pdata$`donor_id:ch1`)



## Assessing rawdata from GEO using getGEOSuppFiles()

getGEOSuppFiles("GSE43700", makeDirectory = F)

list.files()

untar("GSE43700_RAW.tar", exdir = "rawdata")


list.files()

rawdata <- ReadAffy() # Import files

show(rawdata)

rma <- rma(rawdata, normalize = F, background = F) # Skips the normalization and background correction to illustrate their requirement

boxplot(exprs(rma), las=2)

dim(exprs(rawdata))
dim(exprs(rma)) #  rma function combines the individual probe intensities to a probeset intensity

rma <- rma(rawdata, normalize = T, background = T) # Normalization and background correction

boxplot(exprs(rma), las=2)

filt_eset2 <- featureFilter(rma, require.entrez = T, remove.dupEntrez = T)

dim(rma)
dim(filt_eset2)

## Add annotation information to the eSet featureData slot

columns(hgu133plus2.db) # Features retrievable by AnnotationDbi::select

anno_filt_eset2 <- AnnotationDbi::select(hgu133plus2.db, keys = (featureNames(filt_eset2)), columns = c("SYMBOL", "GENENAME", "ENTREZID"), keytype = "PROBEID")
written 4 weeks ago by A

No, GSE43700 is definitely human data, not rat. Please check the GEO record: https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE43700

The following lines are a problem:

untar("GSE43700_RAW.tar", exdir = "rawdata")

Here, you decompress to a directory called 'rawdata'.

rawdata <- ReadAffy() # Import files

Here, you are reading in files from the current working directory, which, on your computer, seems to contain Rat data (probably from some other study that was indeed Rattus norvegicus).

What you need is this:

rawdata <- ReadAffy(filenames = list.files('rawdata/', full.names = TRUE))
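Before annotating, it may also help to confirm which chip the CEL files actually come from. The following is only a rough sketch: `chip_db_name()` is a hypothetical helper (not part of affy), and it assumes the usual Bioconductor naming convention in which the annotation package is the chip's annotation string plus ".db", which does not hold for every platform.

```r
# Hypothetical helper (not part of affy or Biobase): build the conventional
# annotation package name from an AffyBatch annotation string.
# Assumption: the package follows the "<chip>.db" naming convention.
chip_db_name <- function(chip) {
  paste0(tolower(chip), ".db")
}

# Only runs when affy is installed and the 'rawdata' directory exists.
if (requireNamespace("affy", quietly = TRUE) && dir.exists("rawdata")) {
  cel_files <- list.files("rawdata", full.names = TRUE)
  rawdata <- affy::ReadAffy(filenames = cel_files)

  # annotation() reports the chip type recorded in the CEL files,
  # e.g. "rae230a" for a rat array or "hgu133plus2" for a human one.
  chip <- BiocGenerics::annotation(rawdata)
  message("Chip: ", chip, "; annotation package to try: ", chip_db_name(chip))
}
```

If the reported chip does not match the organism you expect, the files being read are probably not the ones you meant to read.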
written 4 weeks ago by Kevin Blighe

Sorry, you are right.

The right one is GSE2457.

I put the CEL files from GSE2457 directly in rawdata and ran the code again, but I got the same error:

> anno_filt_eset2 <- AnnotationDbi::select(hgu133plus2.db, keys = (featureNames(filt_eset2)), columns = c("SYMBOL", "GENENAME", "ENTREZID"), keytype = "PROBEID")
Error in .testForValidKeys(x, keys, keytype, fks) : 
  None of the keys entered are valid keys for 'PROBEID'. Please use the keys method to see a listing of valid arguments.
>
modified 4 weeks ago • written 4 weeks ago by A

Excellent. So, you need to use rae230a.db in place of hgu133plus2.db:

  • hgu133plus2.db is for the Affymetrix Human Genome U133 Plus 2.0 Array (GSE43700)
  • rae230a.db is for the Affymetrix Rat Expression 230A Array (GSE2457)
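As a rough sketch of the corrected call, assuming rae230a.db is installed and `filt_eset2` exists in your session: `fraction_valid_keys()` is a hypothetical helper written here for illustration. It checks how many probe IDs the annotation package recognises before `select()` is called. The three human probe IDs are only example values.

```r
# Hypothetical helper: proportion of probe IDs that are valid keys.
fraction_valid_keys <- function(probe_ids, valid_keys) {
  mean(probe_ids %in% valid_keys)
}

# Base R illustration: rat probe IDs from this thread checked against a few
# example human (HG-U133) probe IDs. None of them match.
rat_ids <- c("1399067_at", "1388081_at", "1375569_at")
example_human_keys <- c("1007_s_at", "1053_at", "117_at")
fraction_valid_keys(rat_ids, example_human_keys)

# The real check and annotation; skipped unless the package and object exist.
if (requireNamespace("rae230a.db", quietly = TRUE) && exists("filt_eset2")) {
  library(rae230a.db)
  ids <- Biobase::featureNames(filt_eset2)
  valid <- AnnotationDbi::keys(rae230a.db, keytype = "PROBEID")
  message("Fraction of probe IDs found: ", fraction_valid_keys(ids, valid))

  anno_filt_eset2 <- AnnotationDbi::select(
    rae230a.db,
    keys = ids,
    columns = c("SYMBOL", "GENENAME", "ENTREZID"),
    keytype = "PROBEID"
  )
}
```

A fraction near zero suggests that the annotation package does not match the array, which is what the 'None of the keys entered are valid keys' error indicates.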
written 4 weeks ago by Kevin Blighe