Cannot convert gene ID to gene symbol
3.3 years ago
PNTick • 0

I am using the microarray data from GEO (GSE116486).

For example, "1563061_at" cannot be converted to a gene symbol. I tried DAVID, Metascape, and other ID conversion websites, but the result was the same: no gene symbol was returned.

My question is: are there still genes with no symbol? If so, does that mean there is potential to find something novel?

I would appreciate it if someone could answer.

https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE116486

gene • 992 views

Thank you very much for answering my question. I see...

Thank you very much for the R code! All my cobwebs are gone now!

3.3 years ago

Yes, predicted genes may not yet have been assigned an ID by Ensembl, Entrez, HGNC, et cetera.

This probe genuinely has no matches:

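library(hgu133plus2.db)  # Bioconductor annotation package for the HG-U133 Plus 2.0 array
# Look up the available identifiers for this probe set.
# AnnotationDbi::select is called explicitly in case another package masks select().
annotLookup <- AnnotationDbi::select(hgu133plus2.db, keys = c('1563061_at'),
  keytype = 'PROBEID',
  columns = c('PROBEID', 'ENSEMBL', 'SYMBOL', 'ENTREZID', 'REFSEQ'))
annotLookup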
     PROBEID ENSEMBL SYMBOL ENTREZID REFSEQ
1 1563061_at    <NA>   <NA>     <NA>   <NA>

If we look at the UCSC Genome Browser, we can begin to see why there are no matches: https://genome.ucsc.edu/cgi-bin/hgTracks?db=hg19&lastV...

That 'gene', BC039487, is hypothetical.
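For a whole list of probe sets, the same kind of lookup table can be split into annotated and unannotated probes. The following is only a minimal sketch in base R, assuming a table shaped like the `select()` output above. Only the `1563061_at` row reflects this thread; the other probe IDs and symbols are hypothetical placeholders.

```r
# Toy lookup table shaped like select() output.
# Only the 1563061_at row reflects this thread; the other rows are hypothetical.
annotLookup <- data.frame(
  PROBEID = c("1563061_at", "probe_A_at", "probe_B_at"),
  SYMBOL  = c(NA, "GENE_A", "GENE_B"),
  stringsAsFactors = FALSE
)

# Probe sets with no gene symbol assigned
unannotated <- annotLookup$PROBEID[is.na(annotLookup$SYMBOL)]

# Keep only the annotated rows for symbol-level downstream analysis
annotated <- annotLookup[!is.na(annotLookup$SYMBOL), ]

unannotated
```

Keeping the unannotated IDs in a separate vector, rather than silently dropping them, makes it easy to inspect them later in a genome browser.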

Kevin


Adding to this, it could also be a control probe, which serves as an internal control for checking how probes that should not change behave after normalization, testing, etc. So if you do not find a match in the browser, it is probably a control probe.


Actually, I didn't know what a control probe was...!

Hmm, so can I think of a control probe as being artificially created to see whether its signal stays the same across conditions (normal or diseased) after normalization or testing?


They will usually be left in the dataset, even after normalisation. However, these probes should begin with AFFX or BGP - see here: A: Control probe sets in Affymetrix ST microarrays

...and you should indeed remove these after normalisation.
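As a rough illustration of that removal step, control probes can be filtered out by prefix in base R. This is only a sketch: the vector `probe_ids` is a hypothetical stand-in for the row names of your normalised expression matrix, and the IDs in it are examples rather than values taken from GSE116486.

```r
# Example probe IDs; in practice, use the row names of your
# normalised expression matrix instead of this hand-written vector.
probe_ids <- c("AFFX-BioB-5_at", "AFFX-r2-Ec-bioC-3_at", "1563061_at", "1007_s_at")

# TRUE for IDs that start with AFFX or BGP (the prefixes mentioned above)
is_control <- grepl("^(AFFX|BGP)", probe_ids)

# Drop the control probes and keep everything else
kept <- probe_ids[!is_control]
kept
```

The same logical vector can be used to subset the rows of an expression matrix, for example `expr_matrix[!is_control, ]`, where `expr_matrix` is again a hypothetical name.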

3.3 years ago

Hi :)

I am not sure whether this is a useful answer, but in R this can be an easy task. Several Bioconductor packages convert gene IDs to gene names quite easily.

e.g.

loading some packages first

if (!requireNamespace("BiocManager", quietly = TRUE))
  install.packages("BiocManager")
BiocManager::install("Homo.sapiens")
library(Homo.sapiens)
require(GenomicRanges)

then run:

genes(TxDb.Hsapiens.UCSC.hg19.knownGene)  # returns a GRanges object of gene coordinates, keyed by Entrez gene ID
gene_names = as.data.frame(org.Hs.egSYMBOL)  # saves the Entrez gene IDs and gene symbols into a data frame

then simply merge your IDs with gene_names, like:

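out = merge(gene_names, your_IDs, by.x = 'gene_id', by.y = 'ID', all.y = TRUE)

note: the syntax above just serves as an example. 'by.x' and 'by.y' might differ in your code, as I am not entirely sure about the column names, etc. Also keep in mind that org.Hs.egSYMBOL is keyed by Entrez gene ID, so array probe IDs such as "1563061_at" would first need to be mapped to Entrez IDs.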
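To make the merge step concrete, here is a small self-contained sketch in base R. The two data frames are toy stand-ins: the column names `gene_id` and `symbol` are an assumption about what `as.data.frame(org.Hs.egSYMBOL)` returns, and all IDs and symbols below are hypothetical.

```r
# Toy stand-in for as.data.frame(org.Hs.egSYMBOL); values are hypothetical
gene_names <- data.frame(
  gene_id = c("1", "2", "3"),
  symbol  = c("GENE_A", "GENE_B", "GENE_C"),
  stringsAsFactors = FALSE
)

# Your own IDs; "999" deliberately has no entry in gene_names
your_IDs <- data.frame(ID = c("2", "3", "999"), stringsAsFactors = FALSE)

# all.y = TRUE keeps every row of your_IDs and fills unmatched symbols with NA
out <- merge(gene_names, your_IDs, by.x = "gene_id", by.y = "ID", all.y = TRUE)
out
```

Using `all.y = TRUE` means IDs without a symbol stay visible as `NA` rows instead of disappearing from the result, which is useful when checking how many of your IDs could not be converted.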

best, chris.
