Is there any gene nomenclature conversion tool that recognizes old names?
3.1 years ago
emibaffo ▴ 50

Hello! I have a list of about 6200 genes in symbol nomenclature (e.g. TP53) that came out of a DE analysis on the LIHC-TCGA data.

I needed to convert the symbols to Entrez IDs to continue my workflow, so I used the org.Hs.eg.db package, but about 800 genes couldn't be converted. When I took a closer look and googled some of them, I saw that the reason was that they were annotated with an old name (e.g. MCUB was annotated as CCDC109B and LAMTOR1 as C11orf59). This is fairly easy to check for a single gene, because NCBI gives the official symbol and then, under an "Also known as" heading, the other non-official and/or former names.

Now I would like to convert these genes to their official symbols, and ultimately to Entrez IDs. They represent about 13% of my DE genes, so I think it would be a shame to just ignore them, but doing it manually would obviously take me forever.

Is there any tool or resource that recognizes these unofficial or former names and can convert them to the official symbol, to Entrez, or to any other official nomenclature?

genomics R annotation • 1.5k views
3.1 years ago
GenoMax 141k

You can use EntrezDirect. In the output, the official symbol is on the first line and the Entrez Gene ID is on the last line:

$ esearch -db gene -query "CCDC109B AND human [orgn]" | efetch

1. MCUB
Official Symbol: MCUB and Name: mitochondrial calcium uniporter dominant negative subunit beta [Homo sapiens (human)]
Other Aliases: CCDC109B
Other Designations: calcium uniporter regulatory subunit MCUb, mitochondrial; coiled-coil domain containing 109B; coiled-coil domain-containing protein 109B; mitochondrial calcium uniporter dominant negative beta subunit; mitochondrial calcium uniporter regulatory subunit MCUb
Chromosome: 4; Location: 4q25
Annotation: Chromosome 4 NC_000004.12 (109560246..109688719)
ID: 55013
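
For a list of several hundred genes you would want to pull the two useful fields out of that text automatically. The following is only a minimal sketch in Python, assuming each record looks like the one above (an "Official Symbol:" line and an "ID:" line); `parse_gene_record` is a hypothetical helper name, not part of EntrezDirect, and the sample record is copied from the output shown here (shortened):

```python
import re

def parse_gene_record(text):
    """Return (official_symbol, entrez_id) from one text record, or None for a missing field."""
    # Assumption: the symbol follows "Official Symbol:" and the ID sits on its own "ID:" line.
    symbol = re.search(r"^Official Symbol:\s*(\S+)", text, flags=re.MULTILINE)
    gene_id = re.search(r"^ID:\s*(\d+)", text, flags=re.MULTILINE)
    return (symbol.group(1) if symbol else None,
            gene_id.group(1) if gene_id else None)

# Sample record taken from the efetch output above.
record = """1. MCUB
Official Symbol: MCUB and Name: mitochondrial calcium uniporter dominant negative subunit beta [Homo sapiens (human)]
Other Aliases: CCDC109B
Chromosome: 4; Location: 4q25
Annotation: Chromosome 4 NC_000004.12 (109560246..109688719)
ID: 55013
"""

result = parse_gene_record(record)
print(result)
```

Note that a query can return more than one record (for example when an old name matches several genes), so in practice you would split the output into records first and inspect any ambiguous hits by hand.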
3.1 years ago

The HUGO Gene Nomenclature Committee (HGNC) maintains the approved human gene nomenclature. You can download their most up-to-date data here, including previous symbols, alias symbols, and yes, entrez_id. These are the columns of the file:

hgnc_id symbol  name    locus_group locus_type  status  location    location_sortable   alias_symbol    alias_name  prev_symbol prev_name   gene_family gene_family_id  date_approved_reserved  date_symbol_changed date_name_changed   date_modified   entrez_id   ensembl_gene_id vega_id ucsc_id ena refseq_accession    ccds_id uniprot_ids pubmed_id   mgd_id  rgd_id  lsdb    cosmic  omim_id mirbase homeodb snornabase  bioparadigms_slc    orphanet    pseudogene.org  horde_id    merops  imgt    iuphar  kznf_gene_catalog   mamit-trnadb    cd  lncrnadb    enzyme_id   intermediate_filament_db    rna_central_ids lncipedia   gtrnadb agr
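
With those columns, converting old names comes down to a lookup from `prev_symbol` and `alias_symbol` to `symbol` and `entrez_id`. Below is a rough sketch in Python of how that could look. It assumes the download is tab-separated and that multiple values inside `prev_symbol` and `alias_symbol` are separated by `|`; check both against the actual file. `build_lookup` is a hypothetical helper, and the two sample rows are a tiny illustrative stand-in for the real table:

```python
import csv
import io

# Illustrative stand-in for the HGNC file, reduced to the four columns used here.
sample = (
    "symbol\talias_symbol\tprev_symbol\tentrez_id\n"
    "MCUB\t\tCCDC109B\t55013\n"
    "TP53\tp53|LFS1\t\t7157\n"
)

def build_lookup(handle):
    """Map every approved, previous, and alias symbol to (approved symbol, entrez_id)."""
    rows = list(csv.DictReader(handle, delimiter="\t"))
    lookup = {}
    # Register old names first; setdefault keeps the first hit if a name is ambiguous.
    for row in rows:
        target = (row["symbol"], row["entrez_id"])
        for column in ("prev_symbol", "alias_symbol"):
            for name in row[column].split("|"):  # assumed "|" separator
                if name:
                    lookup.setdefault(name, target)
    # Approved symbols are written last so they always override an identical old name.
    for row in rows:
        lookup[row["symbol"]] = (row["symbol"], row["entrez_id"])
    return lookup

lookup = build_lookup(io.StringIO(sample))
for gene in ["TP53", "CCDC109B", "LFS1", "NOT_A_GENE"]:
    print(gene, lookup.get(gene))
```

For the real file you would pass an open file handle instead of the `io.StringIO` sample. Some aliases point to more than one gene, so it is worth reporting those cases rather than silently keeping the first match as this sketch does.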