Question: Why do some probes have "NA" for gene symbol and Entrez ID?
3.8 years ago by
Rahil170 wrote:

Hi, I converted the Affymetrix probe IDs to official gene symbols using library(annotate). However, many of them have "NA" instead of a gene symbol. Could anyone tell me why, and how I should deal with them? Any help would be appreciated!

modified 3.8 years ago by Biostar ♦♦ 20 • written 3.8 years ago by Rahil170

I don't know what annotate does, but if it maps probes through some sort of ID conversion process, outdated IDs could explain why you're missing genes. For example, the documentation for annotate mentions LocusLink, which has been retired for over 10 years. I would pick a reference genome and map the probes to it.

written 3.8 years ago by Jean-Karim Heriche23k

Some probesets were probably designed based on ESTs and may not map to a known gene. Have a look at the Affymetrix website, get the latest Affymetrix annotation files for the microarray you are dealing with, and check whether the probeset maps to a known gene or not.
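
Before going to the Affymetrix files, it can help to see how many probesets are affected. Below is a minimal sketch, assuming the symbols come as a named character vector (names are probeset IDs), which is the shape `getSYMBOL()` returns. The helper name `split_by_annotation` and the toy IDs `ps_1` to `ps_3` are made up for illustration.

```r
# Split probeset IDs into those with and without a gene symbol.
# `symbols` is a named character vector: names are probeset IDs,
# values are gene symbols (NA or "" when nothing is mapped).
split_by_annotation <- function(symbols) {
  missing <- is.na(symbols) | symbols == ""
  list(
    annotated   = symbols[!missing],        # named vector of mapped symbols
    unannotated = names(symbols)[missing]   # probeset IDs with no symbol
  )
}

# Toy input: IDs and symbols are placeholders, not real array content.
toy <- c(ps_1 = "GENE1", ps_2 = NA, ps_3 = "")
res <- split_by_annotation(toy)
print(res$unannotated)

# With the real packages (assumption: annotate and hgu133plus2.db are
# installed), the same helper can be applied to getSYMBOL() output.
if (requireNamespace("annotate", quietly = TRUE) &&
    requireNamespace("hgu133plus2.db", quietly = TRUE)) {
  real <- annotate::getSYMBOL(c("1007_s_at", "AFFX-BioB-5_at"), "hgu133plus2.db")
  str(split_by_annotation(real))
}
```

The `unannotated` IDs are the ones worth looking up in the Affymetrix annotation file.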

written 3.8 years ago by mastal5112.0k

How can I get the latest Affymetrix annotation for hgu133plus2? Many thanks for your help.

written 3.8 years ago by Rahil170

The Affymetrix website is at

Their online database for microarray-related annotations and sequences is the NetAffx Analysis Centre. You will have to log in with a user name and password. Then, as well as querying data for individual probesets, you will be able to download a text file with the latest annotations (na.36) for the hgu133plus2 array, and files with all the probe sequences.

You may actually be able to download the na.36 annotation files without having to register.
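
Once you have the annotation file, something like the following could list the probesets that have no gene symbol. This is only a sketch under assumptions: that the file is a CSV with `#` comment lines at the top, columns named "Probe Set ID" and "Gene Symbol", and `---` as the placeholder for a missing value. Check these against the file you actually download. The function names and the mock lines are made up so the example runs without the real file.

```r
# Read a NetAffx-style annotation CSV (assumed layout, see above).
read_netaffx <- function(file) {
  read.csv(file, comment.char = "#", stringsAsFactors = FALSE,
           check.names = FALSE)  # keep column names with spaces as-is
}

# Return the probeset IDs whose gene symbol is missing or the "---" placeholder.
probesets_without_symbol <- function(annot) {
  sym <- annot[["Gene Symbol"]]
  annot[["Probe Set ID"]][is.na(sym) | sym == "---" | sym == ""]
}

# Mock file content for illustration; IDs and symbols are placeholders.
mock_lines <- c(
  "# mock header comment",
  '"Probe Set ID","Gene Symbol","Entrez Gene"',
  '"ps_1","GENE1","101"',
  '"ps_2","---","---"',
  '"ps_3","GENE2 /// GENE3","102 /// 103"'
)
con <- textConnection(mock_lines)
annot <- read_netaffx(con)
close(con)

unannotated <- probesets_without_symbol(annot)
print(unannotated)
```

For the real file you would pass its path instead of the text connection, e.g. `read_netaffx("path/to/annotation.csv")` (a hypothetical path).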

written 3.8 years ago by mastal5112.0k

Thank you. I am using the hgu133plus2.db and annotate packages to convert Affymetrix probeset IDs to gene symbols.

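library(annotate)
library(hgu133plus2.db)
# Map probeset IDs (row names) to gene symbols; unmapped probesets come back as NA
gene.symbols <- getSYMBOL(rownames(probeset.list), "hgu133plus2")
results <- cbind(probeset.list, gene.symbols)
write.table(results, "results.txt", sep="\t", quote=FALSE)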

If I’m not mistaken, each probeset has about 11 probes that represent a specific gene. So why is there no gene symbol for some probeset IDs?

modified 3.8 years ago • written 3.8 years ago by Rahil170

According to the Affymetrix website, the hgu133plus2 probes were designed against a hodgepodge of sequences, and their reference gene set was a UniGene version from 2001. Human genome annotations have changed in the 15 years since, so it is not surprising that some probes no longer match genes. If you want to understand what's going on, get the probe sequences and map them to an annotated genome reference of your choice. As another option, Ensembl has mapped the probes and makes the mappings available via BioMart.
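
For the BioMart route, here is a rough sketch using the biomaRt package. It assumes biomaRt is installed, that Ensembl is reachable, and that the attribute and filter for this array are named `affy_hg_u133_plus_2` (check with `listAttributes()`). The function names `fetch_ensembl_mapping` and `unmapped_probesets` and the toy data are made up for illustration. The network call is wrapped in a function and not run here.

```r
# Query Ensembl BioMart for probeset -> gene mappings (needs biomaRt + network).
fetch_ensembl_mapping <- function(probesets) {
  mart <- biomaRt::useEnsembl(biomart = "genes",
                              dataset = "hsapiens_gene_ensembl")
  biomaRt::getBM(
    attributes = c("affy_hg_u133_plus_2", "ensembl_gene_id", "hgnc_symbol"),
    filters    = "affy_hg_u133_plus_2",
    values     = probesets,
    mart       = mart
  )
}

# Probesets that are absent from the mapping or only have an empty symbol.
unmapped_probesets <- function(probesets, mapping) {
  has_symbol <- !is.na(mapping$hgnc_symbol) & mapping$hgnc_symbol != ""
  setdiff(probesets, mapping$affy_hg_u133_plus_2[has_symbol])
}

# Toy mapping in the same column layout; all values are placeholders.
toy_mapping <- data.frame(
  affy_hg_u133_plus_2 = c("ps_1", "ps_2"),
  ensembl_gene_id     = c("ENSG_X", "ENSG_Y"),
  hgnc_symbol         = c("GENE1", ""),
  stringsAsFactors    = FALSE
)
still_unmapped <- unmapped_probesets(c("ps_1", "ps_2", "ps_3"), toy_mapping)
print(still_unmapped)

# Real use (not run): mapping <- fetch_ensembl_mapping(rownames(probeset.list))
```

Probesets that remain unmapped here as well are likely the ones that do not correspond to any currently annotated gene.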

written 3.8 years ago by Jean-Karim Heriche23k

Check this post: Converting Affymetrix Probeset Ids To Symbols Or Ensembl Ids

written 3.8 years ago by Ron1000