Converting Ensembl stable IDs to gene symbols: does the version number matter for expression data?
5.0 years ago
Berghopper ▴ 20

Dear Bioconductor,

For my project I want to analyze a set of long non-coding RNAs (lncRNAs) using TCGA expression data for specific cancers (filters: transcriptome profiling, gene expression quantification, RNA-seq, FPKM).

The resulting data identifies each gene by a versioned Ensembl ID, and I need to convert these to gene symbols (SYMBOL), since that is how most of the lncRNAs I work with are listed.

However, when converting, versioned Ensembl IDs match far fewer symbols than the same IDs with the version suffix removed. On top of that, when I use versioned IDs, practically none of my lncRNAs are found in the dataset when I match on their symbols.
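To make clear what I mean by ignoring the version, here is a minimal sketch in Python. The function names and the two-entry lookup table are hypothetical stand-ins for a real annotation source, and the version suffixes in the example IDs are purely illustrative. It only shows the stripping and matching step, not whether doing so is valid.

```python
import re

# A versioned Ensembl stable ID ends in ".<digits>", e.g. "ENSG00000141510.15".
_VERSION_SUFFIX = re.compile(r"\.\d+$")


def strip_version(ensembl_id: str) -> str:
    """Return the stable ID without its trailing version suffix, if any."""
    return _VERSION_SUFFIX.sub("", ensembl_id)


def map_to_symbols(ids, symbol_by_gene_id):
    """Map each (possibly versioned) ID to a symbol, or None when unmatched."""
    return {i: symbol_by_gene_id.get(strip_version(i)) for i in ids}


# Hypothetical toy lookup table keyed by unversioned gene ID.
symbol_by_gene_id = {
    "ENSG00000141510": "TP53",
    "ENSG00000251562": "MALAT1",
}

# Illustrative IDs; the last one is a made-up ID with no entry in the table.
versioned_ids = ["ENSG00000141510.15", "ENSG00000251562.6", "ENSG00000000000.1"]

mapped = map_to_symbols(versioned_ids, symbol_by_gene_id)
unmatched = [i for i, symbol in mapped.items() if symbol is None]
```

Keeping the original versioned ID as the key means the version is still available afterwards, so unmatched or suspicious genes can be checked against the annotation release the data was quantified with.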

Now for my question: will my results be compromised or incorrect if I simply ignore the Ensembl version numbers and translate the unversioned IDs to symbols?

As Ensembl describes it, the transcripts associated with an ID can change from one version number to the next. What exactly does this mean for the TCGA expression data?

Another question: some of my lncRNAs were referenced by ENST IDs (transcripts), and I was able to convert some of them to symbols. Are incorrect assumptions being made when converting transcript IDs like this as well?

I know this question might be nitpicky, but I don't want to end up with faulty results or make incorrect assumptions.

Thanks in advance for any answers :)

RNA-Seq Ensembl IDs conversion symbol lncRNAs