Question: AnnotationDbi returns a different list of symbols than the one taken directly from the database itself
21 months ago by
New York, USA
zephyr_falcon80 wrote:


I'm trying to annotate probe IDs (Affymetrix Mouse Gene 1.0-ST Array) with gene symbols.

I used the R package mogene10sttranscriptcluster.db (v8.7.0) for the annotation.

But here's the problem.

1) Using mogene10sttranscriptcluster.db directly


a <- contents(mogene10sttranscriptclusterSYMBOL)

# a$'10344741'
# [1] NA

2) Using AnnotationDbi to extract the info


k <- keys(mogene10sttranscriptcluster.db, keytype = "PROBEID")
b <- mapIds(mogene10sttranscriptcluster.db, keys=k, column=c("SYMBOL"), keytype="PROBEID")

# 10344741
# "Hnrnpa3" 

Both a and b have the same length, 35556.

But some probes have a symbol in (2) and none in (1).
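
To list exactly which probes differ, the two results can be compared by probe ID rather than by position. The snippet below is only a minimal sketch on toy stand-ins shaped like the objects above: `a` is a named list (as contents() returns) and `b` is a named character vector (as mapIds() returns). The second entry ("probe_2", "GeneX") is a made-up placeholder, not real annotation.

```r
# Toy stand-ins for the real objects; "probe_2" / "GeneX" are hypothetical.
a <- list("10344741" = NA_character_, "probe_2" = "GeneX")
b <- c("10344741" = "Hnrnpa3", "probe_2" = "GeneX")

# Flatten the list to a named character vector (first symbol per probe).
a_vec <- vapply(a, function(x) as.character(x)[1], character(1))

# Align on probe ID, then keep probes that are NA in (1) but annotated in (2).
ids <- intersect(names(a_vec), names(b))
only_in_b <- ids[is.na(a_vec[ids]) & !is.na(b[ids])]
only_in_b
```

With the real `a` and `b`, length(only_in_b) would give the number of probes that only approach (2) annotates.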

Both approaches use the same database, mogene10sttranscriptcluster.db, so how can they give different results?

Does AnnotationDbi convert probe IDs to some other IDs and then convert those to gene symbols?

The second approach seems to return more symbols, so is that the one I should use?

I'm very confused right now.

modified 21 months ago by zx87549.9k • written 21 months ago by zephyr_falcon80
21 months ago by
New York, USA
zephyr_falcon80 wrote:

I found my own answer.

It seems that mogene10sttranscriptcluster.db relies on another package for its annotation.

And the version of that package is different between mogene10sttranscriptcluster.db and AnnotationDbi.

I found this because when I loaded a different version of that package, the same version of mogene10sttranscriptcluster.db (v8.7.0) produced different results.

So, check the versions of the packages your annotation package depends on.
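
One way to do that check, as a rough sketch: the post does not name the package whose version mattered, so this only reports the installed versions of the two packages that are named, plus whatever the annotation package declares in its Depends field. The helper names `report_versions` and `declared_depends` are made up for illustration; the calls inside them are base R / utils.

```r
# Hypothetical helper: installed version of each package, NA if not installed.
report_versions <- function(pkgs) {
  vapply(pkgs, function(p) {
    installed <- nzchar(system.file(package = p))
    if (installed) as.character(utils::packageVersion(p)) else NA_character_
  }, character(1))
}

# Hypothetical helper: the Depends field from a package's DESCRIPTION,
# or NA if the package is not installed or declares no Depends.
declared_depends <- function(pkg) {
  if (!nzchar(system.file(package = pkg))) return(NA_character_)
  utils::packageDescription(pkg, fields = "Depends")
}

report_versions(c("mogene10sttranscriptcluster.db", "AnnotationDbi"))
declared_depends("mogene10sttranscriptcluster.db")
```

Running sessionInfo() after loading the packages also lists the versions of everything attached or loaded, which makes it easier to compare two sessions that give different results.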

