Question

Microarray Probes to Ensembl ID or Gene ID - clariomdhumanprobeset.db

4.6 years ago

Scott McKay ▴ 30

I am trying to pull DEG lists from multiple GEO datasets to cross-analyze them. Is there a way (in either R or Python 3) to convert the probe IDs to something more universal, such as Ensembl ID, HGNC ID, or Gene ID? Please let me know. Thanks!

R python microarray probe gene • 2.7k views

updated 4.6 years ago by Kevin Blighe 87k • written 4.6 years ago by Scott McKay ▴ 30

You can try two things (assuming your dataset used Affymetrix Human Genome U133 Plus 2.0 Array):

Use biomaRt

library(biomaRt)
ensembl = useMart("ensembl",dataset="hsapiens_gene_ensembl")
probeids=c('200007_at', '200011_s_at', '200012_x_at')
getBM(attributes=c('affy_hg_u133_plus_2', 'hgnc_symbol'), 
      filters = 'affy_hg_u133_plus_2', 
      values = probeids, 
      mart = ensembl)

Use GEOquery

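library(GEOquery)
# GSE_id is a placeholder: set it to your GEO series accession (a string starting with "GSE")
gse <- getGEO(GSE_id, GSEMatrix = TRUE)
# fData() returns the platform annotation table stored with the first ExpressionSet
featureData <- Biobase::fData(gse[[1]])
# columns 1 and 11 are assumed to hold the probe ID and gene symbol; check colnames(featureData)
ID_mapping <- featureData[, c(1, 11)]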

written 4.6 years ago by patelk26 ▴ 290

What should I do if the array is not in biomaRt?

written 4.6 years ago by Scott McKay ▴ 30

Which array is it? Try the manufacturer's website for the annotation. Also look at the Bioconductor annotation packages: https://www.bioconductor.org/packages/release/data/annotation/

written 4.6 years ago by Kevin Blighe 87k

It's from the Affymetrix Clariom D Assay.

written 4.6 years ago by Scott McKay ▴ 30

If Human, then the annotation package that you want is: https://www.bioconductor.org/packages/release/data/annotation/html/clariomdhumanprobeset.db.html

written 4.6 years ago by Kevin Blighe 87k

Would I just download the annotation package, run the same script as above, and swap the attribute and filter?

written 4.6 years ago by Scott McKay ▴ 30

Yes, I posted a solution below for that package.

written 4.6 years ago by Kevin Blighe 87k

Yes, but you need to know the array type that you are using. Take a look at this example for Affymetrix U133 Plus 2.0: A: Affymetrix Human Genome U133 Plus 2.0 Array

written 4.6 years ago by Kevin Blighe 87k

score 0 · Answer 1 · 2019-09-15

Response from this comment (above): C: Microarray Probes to Ensembl ID or Gene ID

Oh yes, you just need to use a simple lookup:

# install package (large; > 400MB)
BiocManager::install("clariomdhumanprobeset.db")

# load package
library(clariomdhumanprobeset.db)

# store the probe names (probably rownames of your expression object)
IDs <- c("PSR1700192228.hg.1","PSR1700192231.hg.1","PSR2000155490.hg.1",
  "JUC2000052683.hg.1","PSR0800175519.hg.1","JUC0800062325.hg.1")

# look up the probes
mapIds(
  clariomdhumanprobeset.db,
  keys = IDs,
  column = 'SYMBOL',
  keytype = 'PROBEID')

'select()' returned 1:1 mapping between keys and columns
PSR1700192228.hg.1 PSR1700192231.hg.1 PSR2000155490.hg.1 JUC2000052683.hg.1 
           "CD79B"            "CD79B"             "CDH4"             "CDH4" 
PSR0800175519.hg.1 JUC0800062325.hg.1 
         "RUNX1T1"          "RUNX1T1"
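
As the output shows, several probesets can map to the same gene (for example, two of them map to CD79B), so a probeset-level DEG list usually needs collapsing to one row per gene before it can be compared across datasets. The following is only a minimal base-R sketch of one way to do that. The DEG table, its column names, the logFC and adj.P values, the unmapped probeset `PSR0000000000.hg.1`, and the second gene list are all hypothetical and invented for illustration. Keeping the probeset with the smallest adjusted p-value per gene is one common choice, not the only one.

```r
# Hypothetical probeset-level DEG table (all values invented for illustration)
deg <- data.frame(
  probe = c("PSR1700192228.hg.1", "PSR1700192231.hg.1", "PSR2000155490.hg.1",
            "JUC2000052683.hg.1", "PSR0000000000.hg.1"),
  logFC = c(1.8, 2.1, -1.2, -0.9, 0.5),
  adj.P = c(0.010, 0.002, 0.030, 0.040, 0.200),
  stringsAsFactors = FALSE
)

# Named character vector in the shape returned by mapIds();
# the last (made-up) probeset has no gene symbol
symbols <- c(
  "PSR1700192228.hg.1" = "CD79B",
  "PSR1700192231.hg.1" = "CD79B",
  "PSR2000155490.hg.1" = "CDH4",
  "JUC2000052683.hg.1" = "CDH4",
  "PSR0000000000.hg.1" = NA_character_
)

# Attach the gene symbol to each row, matching on the probeset name
deg$symbol <- unname(symbols[deg$probe])

# Drop probesets that have no gene symbol
deg <- deg[!is.na(deg$symbol), ]

# Sort by adjusted p-value, then keep the first (most significant) row per gene
deg <- deg[order(deg$adj.P), ]
gene_level <- deg[!duplicated(deg$symbol), ]
print(gene_level)

# Compare with the gene symbols from a second, hypothetical dataset
other_genes <- c("CDH4", "RUNX1T1")
shared <- intersect(gene_level$symbol, other_genes)
print(shared)
```

With real data, `symbols` would be the result of the `mapIds()` call and `deg` would be your own results table (for example, from limma's `topTable()`).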

To see what other types of annotation can be returned (for example, Ensembl or Entrez gene IDs), run:

keytypes(clariomdhumanprobeset.db)

 [1] "ACCNUM"       "ALIAS"        "ENSEMBL"      "ENSEMBLPROT"  "ENSEMBLTRANS"
 [6] "ENTREZID"     "ENZYME"       "EVIDENCE"     "EVIDENCEALL"  "GENENAME"    
[11] "GO"           "GOALL"        "IPI"          "MAP"          "OMIM"        
[16] "ONTOLOGY"     "ONTOLOGYALL"  "PATH"         "PFAM"         "PMID"        
[21] "PROBEID"      "PROSITE"      "REFSEQ"       "SYMBOL"       "UCSCKG"      
[26] "UNIGENE"      "UNIPROT"
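
Since the original question asked for Ensembl or Gene IDs rather than symbols, it may help to fetch several of these columns in one call. A rough sketch using `select()` from AnnotationDbi (the same package that provides `mapIds()`): the wrapper name `annotate_probes` and its default column choice are hypothetical, not part of any package. Note that `select()` returns a data frame and can return more rows than keys when a probeset maps to more than one ID.

```r
# Hypothetical helper (not part of any package): look up several ID types
# for a vector of probeset IDs in a Bioconductor chip annotation package.
annotate_probes <- function(db, probe_ids,
                            columns = c("SYMBOL", "ENSEMBL", "ENTREZID")) {
  # select() returns a data.frame with one row per probeset/ID combination,
  # so a probeset that maps to several IDs appears on several rows
  AnnotationDbi::select(db, keys = probe_ids, columns = columns,
                        keytype = "PROBEID")
}

# Usage, assuming clariomdhumanprobeset.db is installed and IDs holds
# the probeset names as in the lookup above:
# library(clariomdhumanprobeset.db)
# annot <- annotate_probes(clariomdhumanprobeset.db, IDs)
# head(annot)
```

If you need exactly one value per probeset instead, `mapIds()` with its `multiVals` argument (for example `multiVals = "first"`) is the simpler route.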

There is also an example in Section 17.4.4 (Gene annotation) of the limma User's Guide.

Kevin