Question: 100 % Coverage for Gene ID Mapping
shinobee wrote (21 months ago):

I've encountered some gene IDs that have no corresponding Entrez ID, and some probe IDs that share the same Entrez ID. How do you solve this problem?

biocLite("KEGGdzPathwaysGEO")
library(KEGGdzPathwaysGEO)   # load the package so data(GSE1297) can find the dataset
#Alzheimer  
# Title: Incipient Alzheimer's Disease: Microarray Correlation Analyses 
# URL: http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE1297 
# PMIDs: 14769913 
# 9 control 7 disease

data(GSE1297)    
data <- GSE1297@assayData$exprs
grp <- as.integer(as.factor(GSE1297@phenoData@data$Group)) - 1

library('hgu133a.db')    # here use your chip hgu133a.db
g_ids <- mapIds(hgu133a.db, keys=rownames(data), c("ENTREZID"), keytype="PROBEID")

# Many probes share the same Entrez ID
> length(which(table(g_ids)>1))
[1] 4805
> sort(table(g_ids),decreasing = T)[1:3]
g_ids
10730  3514  3077 
19    17    13 

 > g_ids[which(g_ids == 10730)]
 201351_s_at   201352_at 209671_x_at 210972_x_at 211902_x_at   213830_at 215524_x_at   215540_at
  "10730"     "10730"     "10730"     "10730"     "10730"     "10730"     "10730"     "10730" 
 215769_at   215796_at   216133_at 216191_s_at 216304_x_at   216540_at   217056_at 217063_x_at 
   "10730"     "10730"     "10730"     "10730"     "10730"     "10730"     "10730"     "10730" 
 217065_at 217143_s_at   217397_at 
   "10730"     "10730"     "10730" 

# Probes with no mapping to an Entrez ID
> length(which(is.na(g_ids)))
[1] 1165

genomax wrote (21 months ago):

See if this helps: "How do I map Affymetrix probe IDs to gene symbols in R?" (check the blog page linked there).
shinobee wrote (21 months ago):

That post uses a different function:

ae.annots <- AnnotationDbi::select(
  x       = hgu133a.db,
  keys    = rownames(data),
  columns = "ENTREZID",
  keytype = "PROBEID"
)

But we still run into the problems I mentioned:

One-to-many mapping between probe IDs and Entrez IDs:

sort(table(ae.annots$ENTREZID),decreasing = T)[1:3]

3500 3507 3493 
  36   32   22

and probes with no mapping at all:

sum(is.na(ae.annots$ENTREZID))
[1] 1165
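
For context, `select()` returns one row per probe/Entrez pair, so a probe that maps to several Entrez IDs shows up on several rows, while `mapIds()` by default keeps only the first match. A minimal sketch of how the three cases could be told apart is below. The data frame is invented and only mimics the columns of `ae.annots`; the variable names are my own.

```r
# Invented annotation table with the same columns as `ae.annots`.
annots <- data.frame(
  PROBEID  = c("a_at", "a_at", "b_at", "c_at", "d_at"),
  ENTREZID = c("3500", "3507", "3500", NA, "10730"),
  stringsAsFactors = FALSE
)

# Probes that map to more than one Entrez ID (one probe, many genes)
per_probe <- tapply(annots$ENTREZID, annots$PROBEID,
                    function(x) length(unique(x[!is.na(x)])))
multi_probes <- names(per_probe)[per_probe > 1]

# Entrez IDs hit by more than one probe (many probes, one gene)
per_gene <- tapply(annots$PROBEID, annots$ENTREZID,
                   function(x) length(unique(x)))
shared_genes <- names(per_gene)[per_gene > 1]

# Probes with no Entrez ID at all
unmapped <- unique(annots$PROBEID[is.na(annots$ENTREZID)])
```

On the real table, the same three vectors would be computed from `ae.annots` instead of `annots`.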

genomax wrote (21 months ago):

It is not unusual to find probes that no longer map to extant genes. The underlying sequence may have been revised over time and may no longer exist in current genome builds.
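
One possible way to deal with both issues, sketched under assumptions rather than offered as the standard answer: drop probes without an Entrez ID, and for each Entrez ID with several probes keep the probe with the highest mean expression. Keeping the most variable probe or averaging the probes are equally defensible choices. The toy matrix and mapping stand in for `data` and `g_ids`, their values are made up, and `collapse_to_genes` is a hypothetical helper, not a function from any package.

```r
# Toy stand-ins for `data` (probes x samples) and `g_ids` (named probe -> Entrez map).
# Probe names and values are invented for illustration.
expr <- matrix(
  c(5, 6, 7,
    9, 8, 9,
    2, 3, 2,
    4, 4, 4),
  nrow = 4, byrow = TRUE,
  dimnames = list(c("p1_at", "p2_at", "p3_at", "p4_at"), c("s1", "s2", "s3"))
)
ids <- c(p1_at = "10730", p2_at = "10730", p3_at = NA, p4_at = "3514")

# Hypothetical helper: one row per Entrez ID, keeping the probe with the highest mean.
collapse_to_genes <- function(expr, ids) {
  ids  <- ids[rownames(expr)]           # align the mapping to the matrix rows
  keep <- !is.na(ids)                   # drop probes without an Entrez ID
  expr <- expr[keep, , drop = FALSE]
  ids  <- ids[keep]
  avg  <- rowMeans(expr)
  # for each Entrez ID, pick the probe with the largest mean expression
  best <- vapply(split(names(avg), ids),
                 function(p) p[which.max(avg[p])],
                 character(1))
  out <- expr[best, , drop = FALSE]
  rownames(out) <- names(best)          # rows are now labelled by Entrez ID
  out
}

gene_expr <- collapse_to_genes(expr, ids)
```

With the real objects the call would be `collapse_to_genes(data, g_ids)`, which discards the unmapped probes and leaves one row per Entrez ID.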