Question: Safe to convert entrez gene id to ensembl gene id
7 weeks ago, tmms wrote:

Hello

I have performed a differential expression analysis with DESeq2 in R. I obtained a list of differentially expressed genes. These genes are identified by ensembl gene id (ENSG00....).

I want to use these results for gene-set enrichment analysis and pathway analysis in R. However, almost all of the packages I have come across for this kind of analysis require Entrez gene IDs. I know that biomaRt can build a table that converts between these identifiers, but I observed that the mapping between Ensembl and Entrez IDs is not unique: some Ensembl IDs map to several Entrez IDs and vice versa, and some Ensembl IDs map to no Entrez ID at all.
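To make the mapping problem concrete, here is a minimal sketch that audits a two-column Ensembl/Entrez table, such as one exported from biomaRt. It is written in Python using only the standard library; the function name `audit_mapping` and all identifiers in `rows` are made-up placeholders, not real gene IDs.

```python
from collections import defaultdict

def audit_mapping(pairs):
    """Classify Ensembl IDs as one-to-one, ambiguous, or unmapped.

    pairs: iterable of (ensembl_id, entrez_id) rows; entrez_id is None
    when the table has no Entrez ID for that gene.
    """
    ens_to_entrez = defaultdict(set)
    entrez_to_ens = defaultdict(set)
    for ens, entrez in pairs:
        targets = ens_to_entrez[ens]  # registers the gene even if unmapped
        if entrez is not None:
            targets.add(entrez)
            entrez_to_ens[entrez].add(ens)

    report = {"one_to_one": [], "ambiguous": [], "unmapped": []}
    for ens, entrez_ids in ens_to_entrez.items():
        if not entrez_ids:
            report["unmapped"].append(ens)
        elif len(entrez_ids) == 1 and len(entrez_to_ens[next(iter(entrez_ids))]) == 1:
            report["one_to_one"].append(ens)
        else:
            # Either one Ensembl -> many Entrez, or many Ensembl -> one Entrez.
            report["ambiguous"].append(ens)
    return report

# Hypothetical placeholder rows, not real identifiers.
rows = [
    ("ENSG_A", "101"),
    ("ENSG_B", "102"), ("ENSG_B", "103"),  # one Ensembl ID, two Entrez IDs
    ("ENSG_C", "104"), ("ENSG_D", "104"),  # two Ensembl IDs, one Entrez ID
    ("ENSG_E", None),                      # no Entrez ID
]
report = audit_mapping(rows)
for category, genes in report.items():
    print(category, sorted(genes))
```

Counting how many genes fall into each category shows how much of the data is actually affected before any decision is made about dropping or merging identifiers.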

That made me wonder whether it is "safe" to convert gene sets defined by Entrez IDs into gene sets defined by Ensembl IDs.
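One way to picture that conversion is sketched below, assuming the gene sets are held as a name-to-IDs dictionary and the mapping is a list of (Ensembl, Entrez) pairs. The function `translate_gene_sets`, the set name `SET_X`, and all IDs are hypothetical placeholders. This sketch keeps every Ensembl gene linked to an Entrez ID and records the Entrez IDs that could not be translated; other policies are possible.

```python
from collections import defaultdict

def translate_gene_sets(gene_sets, pairs):
    """Rewrite Entrez-based gene sets as Ensembl-based gene sets.

    gene_sets: dict of set name -> set of Entrez IDs
    pairs: iterable of (ensembl_id, entrez_id) rows (entrez_id may be None)
    Returns (translated sets, Entrez IDs dropped per set).
    """
    entrez_to_ens = defaultdict(set)
    for ens, entrez in pairs:
        if entrez is not None:
            entrez_to_ens[entrez].add(ens)

    translated, dropped = {}, {}
    for name, entrez_ids in gene_sets.items():
        translated[name] = set()
        dropped[name] = set()
        for entrez in entrez_ids:
            if entrez in entrez_to_ens:
                # Keep every Ensembl gene linked to this Entrez ID.
                translated[name] |= entrez_to_ens[entrez]
            else:
                dropped[name].add(entrez)
    return translated, dropped

# Hypothetical placeholder data, not real identifiers.
rows = [
    ("ENSG_A", "101"),
    ("ENSG_C", "104"), ("ENSG_D", "104"),  # two Ensembl IDs share one Entrez ID
    ("ENSG_E", None),
]
gene_sets = {"SET_X": {"101", "104", "999"}}
translated, dropped = translate_gene_sets(gene_sets, rows)
print(sorted(translated["SET_X"]), sorted(dropped["SET_X"]))
```

Note that expanding a shared Entrez ID into several Ensembl IDs changes the size of the gene set, which can affect enrichment statistics, so the dropped and expanded entries are worth inspecting.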

Other suggestions for handling this are also welcome. I already tried filtering out, before the differential expression analysis, all genes that have no Entrez ID or that have a duplicated Ensembl or Entrez ID, but then I lost some differentially expressed genes.
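An alternative to filtering beforehand is to run the differential expression on all Ensembl IDs and convert only the results. The sketch below assumes the results are available as a dictionary of Ensembl ID to adjusted p-value with no missing values; when several Ensembl IDs share an Entrez ID it keeps the one with the smallest adjusted p-value. That tie-breaking rule is an assumption for illustration, not an established standard, and `collapse_to_entrez` and all IDs are hypothetical.

```python
def collapse_to_entrez(results, pairs):
    """Map DE results from Ensembl IDs to Entrez IDs after testing.

    results: dict of ensembl_id -> adjusted p-value (no missing values)
    pairs: iterable of (ensembl_id, entrez_id) rows (entrez_id may be None)
    Returns dict of entrez_id -> (ensembl_id, adjusted p-value).
    """
    best = {}
    for ens, entrez in pairs:
        if entrez is None or ens not in results:
            continue  # unmapped or untested genes cannot be converted
        padj = results[ens]
        # Keep the most significant Ensembl gene for each Entrez ID.
        if entrez not in best or padj < best[entrez][1]:
            best[entrez] = (ens, padj)
    return best

# Hypothetical placeholder data, not real identifiers or real p-values.
rows = [
    ("ENSG_A", "101"),
    ("ENSG_B", "102"), ("ENSG_B", "103"),
    ("ENSG_C", "104"), ("ENSG_D", "104"),
    ("ENSG_E", None),
]
results = {"ENSG_A": 0.001, "ENSG_B": 0.02, "ENSG_C": 0.04,
           "ENSG_D": 0.003, "ENSG_E": 0.0005}
by_entrez = collapse_to_entrez(results, rows)
print(by_entrez)
```

With this approach an Ensembl gene that maps to two Entrez IDs appears under both, and a gene without an Entrez ID stays in the Ensembl-level results even though it is absent from the converted table.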

Thanks in advance.

Tim

kegg go rna-seq biomart

Have you tried WebGestaltR? I know WebGestalt (the online version) works well with both Entrez and Ensembl IDs, so it makes sense that the R package would as well.

Disclaimer: I previously worked for the team that developed WebGestalt.

Comment written 7 weeks ago by RamRS
7 weeks ago, mark.ziemann (Australia/Melbourne/Geelong/Deakin) wrote:

Yes, I think it is safe to convert Ensembl IDs to Entrez IDs. Ensembl is more comprehensive, especially with respect to ncRNA genes. You will lose some of these, but those genes are also less likely to be annotated in the gene set databases you are using. A limitation of GSEA techniques is that many genes have not yet been fully functionally annotated, so you will still need to browse the top DEGs (especially ncRNAs) to interpret the results as a whole.
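As a rough illustration of that last point, the sketch below lists the significant genes that have no Entrez ID so they can be reviewed by hand. The function `unmapped_degs`, the cutoff `alpha=0.05`, and all IDs and p-values are assumptions made for the example.

```python
def unmapped_degs(results, pairs, alpha=0.05):
    """Return significant genes without an Entrez ID, most significant first.

    results: dict of ensembl_id -> adjusted p-value
    pairs: iterable of (ensembl_id, entrez_id) rows (entrez_id may be None)
    alpha: significance cutoff (hypothetical default)
    """
    mapped = {ens for ens, entrez in pairs if entrez is not None}
    hits = [(ens, padj) for ens, padj in results.items()
            if padj < alpha and ens not in mapped]
    return sorted(hits, key=lambda item: item[1])

# Hypothetical placeholder data, not real identifiers or real p-values.
rows = [("ENSG_A", "101"), ("ENSG_E", None), ("ENSG_F", None)]
results = {"ENSG_A": 0.001, "ENSG_E": 0.0005, "ENSG_F": 0.2}
to_review = unmapped_degs(results, rows)
print(to_review)
```

The resulting list is the part of the differential expression result that an Entrez-based enrichment analysis cannot see.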
