Mapping EntrezId to Ensembl IDs returns NA for pseudogenes and snoRNA
20 days ago

Hello everybody, I am pretty new to bioinformatics and would really appreciate any advice on this issue (and on how to look for help properly). I have a list of genes from an RNA-Seq experiment identified by Entrez ID that I need to convert to Ensembl IDs. I am using AnnotationDbi with both EnsDb.Hsapiens.v86 and an OrgDb annotation package, but I get NA values for pseudogenes and snoRNAs whenever I run the code below. Is there a better way of doing this? From searching online this seems to be a frequent issue, but it should be solvable: I checked several of the unmapped genes and they do have Ensembl IDs assigned to them. Thanks in advance!

EnsDb2 <- AnnotationDbi::mapIds(EnsDb.Hsapiens.v86,
                                keys = Data$gene_id,
                                column = "GENEID",
                                keytype = "ENTREZID")

orgDb_mapID <- AnnotationDbi::mapIds(,
                                           keys = Data$gene_id,  
                                           column = "ENSEMBL",
                                           keytype = "ENTREZID")

sessionInfo()

R version 4.0.3 (2020-10-10)
Platform: x86_64-apple-darwin17.0 (64-bit)
Running under: macOS Big Sur 10.16

Matrix products: default
LAPACK: /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRlapack.dylib

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] parallel  stats4    stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] EnsDb.Hsapiens.v86_2.99.0 ensembldb_2.14.0          AnnotationFilter_1.14.0   GenomicFeatures_1.42.3   
 [5] GenomicRanges_1.42.0      GenomeInfoDb_1.26.7       xlsx_0.6.5            
 [9] AnnotationDbi_1.52.0      IRanges_2.24.1            S4Vectors_0.28.1          Biobase_2.50.0           
[13] BiocGenerics_0.36.1      

loaded via a namespace (and not attached):
 [1] Rcpp_1.0.6                  lattice_0.20-41             prettyunits_1.1.1          
 [4] Rsamtools_2.6.0             xlsxjars_0.6.1              Biostrings_2.58.0          
 [7] assertthat_0.2.1            utf8_1.2.1                  BiocFileCache_1.14.0       
[10] R6_2.5.0                    RSQLite_2.2.6               httr_1.4.2                 
[13] pillar_1.6.0                zlibbioc_1.36.0             rlang_0.4.10               
[16] progress_1.2.2              lazyeval_0.2.2              curl_4.3                   
[19] rstudioapi_0.13             blob_1.2.1                  Matrix_1.3-2               
[22] BiocParallel_1.24.1         stringr_1.4.0               ProtGenerics_1.22.0        
[25] RCurl_1.98-1.3              bit_4.0.4                   biomaRt_2.46.3             
[28] DelayedArray_0.16.3         compiler_4.0.3              rtracklayer_1.50.0         
[31] pkgconfig_2.0.3             askpass_1.1                 openssl_1.4.3              
[34] tidyselect_1.1.0            SummarizedExperiment_1.20.0 tibble_3.1.1               
[37] GenomeInfoDbData_1.2.4      matrixStats_0.58.0          XML_3.99-0.6               
[40] fansi_0.4.2                 crayon_1.4.1                dplyr_1.0.5                
[43] dbplyr_2.1.1                GenomicAlignments_1.26.0    bitops_1.0-6               
[46] rappdirs_0.3.3              grid_4.0.3                  lifecycle_1.0.0            
[49] DBI_1.1.1                   magrittr_2.0.1              stringi_1.5.3              
[52] cachem_1.0.4                XVector_0.30.0              xml2_1.3.2                 
[55] ellipsis_0.3.1              generics_0.1.0              vctrs_0.3.7                
[58] tools_4.0.3                 bit64_4.0.5                 glue_1.4.2                 
[61] purrr_0.3.4                 hms_1.0.0                   MatrixGenerics_1.2.1       
[64] fastmap_1.1.0               BiocManager_1.30.12         memoise_2.0.0              
[67] rJava_0.9-13
Hi Dante,

For this I suggest using biomaRt. Its vignette gives a clear explanation of how to use the package. Make sure you use the latest version of the Ensembl annotation.
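
As a rough sketch of what that could look like (not tested against your data): the first function below asks the Ensembl BioMart for Entrez-to-Ensembl pairs, and the second lines the result up with your input so that unmapped and multi-mapped IDs are easy to spot. The helper names fetch_entrez_to_ensembl and align_to_input are made up for this example. It assumes human genes, an internet connection, and a recent Ensembl release in which both the attribute and the filter are called entrezgene_id.

```r
# Hypothetical helper: query Ensembl BioMart for Entrez -> Ensembl gene pairs.
# Needs the biomaRt package and network access; assumes a human dataset.
fetch_entrez_to_ensembl <- function(entrez_ids) {
  mart <- biomaRt::useEnsembl(biomart = "ensembl",
                              dataset = "hsapiens_gene_ensembl")
  biomaRt::getBM(
    attributes = c("entrezgene_id", "ensembl_gene_id", "gene_biotype"),
    filters    = "entrezgene_id",
    values     = unique(as.character(entrez_ids)),
    mart       = mart
  )
}

# Hypothetical helper: one row per input ID, in input order.
# Keeps the first Ensembl hit and reports how many hits there were.
align_to_input <- function(entrez_ids, bm) {
  keys    <- as.character(entrez_ids)
  bm_keys <- as.character(bm$entrezgene_id)
  idx     <- match(keys, bm_keys)  # position of the first hit, NA if none
  n_hits  <- vapply(keys, function(k) sum(bm_keys == k, na.rm = TRUE),
                    integer(1), USE.NAMES = FALSE)
  data.frame(
    entrez_id       = keys,
    ensembl_gene_id = bm$ensembl_gene_id[idx],
    gene_biotype    = bm$gene_biotype[idx],
    n_hits          = n_hits,
    stringsAsFactors = FALSE
  )
}

# Example usage with the data from the question (not run here; needs network):
# bm     <- fetch_entrez_to_ensembl(Data$gene_id)
# mapped <- align_to_input(Data$gene_id, bm)
# table(is.na(mapped$ensembl_gene_id))  # how many IDs are still unmapped
# subset(mapped, n_hits > 1)            # Entrez IDs with several Ensembl IDs
```

Entrez and Ensembl IDs are not one-to-one, so a few NAs and a few multiple hits are to be expected whichever tool you use. The n_hits column at least makes the multiple hits visible when only the first one is kept.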

Best regards!

