Question: How To Map Isoform IDs Of Transcripts To Entrez IDs?
6.1 years ago by
jack450 wrote:


I've got RNA-seq data from TCGA, with both gene-level and isoform-level expression. How can I map the isoform IDs of the transcripts to Entrez gene IDs?

My isoform IDs look like this:

isoform_id    normalized_count
uc011lsn.1    0.0000
uc010unu.1    20.1848
uc010uoa.1    7.1561
uc002bgz.2    36.1698
uc002bic.2    0.0000
uc010zzl.1    188.5822
uc001jiu.2    1085.9445
ngs tcga genomics • 5.7k views

I would normally recommend either BioMart or the UCSC Table Browser for this task. But before we go any further: none of those isoform IDs appear to be valid? I found some corresponding Entrez IDs on this mailing list, and those IDs are not valid either, having been replaced.

written 6.1 years ago by Neilfws48k
6.1 years ago by
Czech Republic, Brno, CEITEC
mikhail.shugay3.4k wrote:

Those are UCSC isoform IDs. You can either get the corresponding table from the UCSC Genome Browser by selecting track=UCSC Genes and table=knownToKeggEntrez and then use that table as a dictionary to remap, or paste the list into a gene ID conversion tool such as DAVID.
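
For the "use the table as a dictionary" step, here is a minimal Python sketch. It assumes the downloaded table is tab-separated with a header row; the column names `ucsc_id` and `entrez_id` and all the IDs in the example rows are made-up placeholders, so check them against the header of the file you actually download.

```python
import csv
import io


def load_mapping(handle, key_col, value_col):
    """Read a tab-separated table with a header row into a dict."""
    reader = csv.DictReader(handle, delimiter="\t")
    mapping = {}
    for row in reader:
        # Keep the first value seen if an isoform ID occurs on several rows.
        mapping.setdefault(row[key_col], row[value_col])
    return mapping


def remap(isoform_ids, mapping):
    """Return (isoform_id, entrez_id or None) pairs in input order."""
    return [(iso, mapping.get(iso)) for iso in isoform_ids]


# Placeholder table contents; a real run would open the downloaded file instead.
example_table = (
    "ucsc_id\tentrez_id\n"
    "uc000aaa.1\t1001\n"
    "uc000aab.2\t1002\n"
    "uc000aab.2\t1002\n"
)

mapping = load_mapping(io.StringIO(example_table), "ucsc_id", "entrez_id")
result = remap(["uc000aaa.1", "uc000aab.2", "uc000zzz.1"], mapping)
for isoform_id, entrez_id in result:
    print(isoform_id, entrez_id)
```

Isoform IDs that are missing from the table come back as `None`, which makes it easy to see how many IDs (for example retired ones) could not be mapped.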

I'm not sure whether you strictly need a mapping to Entrez IDs or just want to group isoforms by gene. In the latter case I recommend switching to RefSeq IDs and using the RefSeq track to get gene names. The table for this conversion can be obtained by selecting track=RefSeq Genes and table=kgXref.
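
If grouping by gene is all you need, something like the sketch below might do. It assumes you have already turned the conversion table into a dict from isoform ID to gene name; the IDs, gene names and counts are made-up placeholders, not real data.

```python
from collections import defaultdict


def group_by_gene(isoform_counts, isoform_to_gene):
    """Group (isoform_id, count) pairs by gene name.

    Returns (groups, unmapped): groups maps gene name to a list of
    (isoform_id, count) pairs, unmapped lists isoform IDs with no gene.
    """
    groups = defaultdict(list)
    unmapped = []
    for isoform_id, count in isoform_counts:
        gene = isoform_to_gene.get(isoform_id)
        if gene is None:
            unmapped.append(isoform_id)
        else:
            groups[gene].append((isoform_id, count))
    return dict(groups), unmapped


# Hypothetical mapping and expression values, for illustration only.
isoform_to_gene = {
    "uc000aaa.1": "GENE_A",
    "uc000aab.2": "GENE_A",
    "uc000aac.1": "GENE_B",
}
isoform_counts = [
    ("uc000aaa.1", 20.5),
    ("uc000aab.2", 7.25),
    ("uc000aac.1", 0.0),
    ("uc000zzz.1", 3.0),
]

groups, unmapped = group_by_gene(isoform_counts, isoform_to_gene)
print(groups)
print(unmapped)
```

How to combine the isoform values within each gene (summing, taking the maximum, and so on) is left open here, since it depends on the analysis.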
