Mapping Entrez IDs to Ensembl IDs returns NA for pseudogenes and snoRNAs
20 days ago

Hello everybody, I am pretty new to bioinformatics and would really appreciate any advice on this issue (and on how to properly look for help). I have a list of genes from an RNA-Seq experiment as Entrez IDs that I need to convert to Ensembl IDs. I am using AnnotationDbi with both EnsDb.Hsapiens.v86 and org.Hs.eg.db, but I get NA values for pseudogenes and snoRNAs whenever I run the code below. Is there a better way of doing this? From what I have found online this seems to be a frequent issue, but it should be solvable, since I checked several of the unmapped genes and they do have Ensembl IDs assigned to them. Thanks in advance!

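library(AnnotationDbi)
library(EnsDb.Hsapiens.v86)
library(org.Hs.eg.db)

# Entrez -> Ensembl using the Ensembl-based annotation package
EnsDb2 <- AnnotationDbi::mapIds(EnsDb.Hsapiens.v86,
                                keys = Data$gene_id,
                                column = "GENEID",
                                keytype = "ENTREZID",
                                multiVals = "first")

# Entrez -> Ensembl using the NCBI-based annotation package
orgDb_mapID <- AnnotationDbi::mapIds(org.Hs.eg.db,
                                     keys = Data$gene_id,
                                     column = "ENSEMBL",
                                     keytype = "ENTREZID")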
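
To see which Entrez IDs fail to map, the NA entries can be pulled out of the named vector that mapIds() returns. This is only a minimal sketch: `mapped` is a made-up stand-in for a result such as orgDb_mapID, and its values (including the unmapped key) are illustrative, not taken from the actual data.

```r
# Toy stand-in for the named character vector returned by mapIds():
# names are the Entrez IDs used as keys, values are Ensembl IDs or NA.
# The entries below are illustrative only.
mapped <- c("1" = "ENSG00000121410",
            "2" = "ENSG00000175899",
            "999999999" = NA)

# Entrez IDs for which no Ensembl ID was returned
unmapped <- names(mapped)[is.na(mapped)]

# Number of unmapped keys, followed by the keys themselves
length(unmapped)
unmapped
```

The resulting IDs can then be looked up by hand to check whether they really are pseudogenes and snoRNAs.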


sessionInfo()

R version 4.0.3 (2020-10-10)
Platform: x86_64-apple-darwin17.0 (64-bit)
Running under: macOS Big Sur 10.16

Matrix products: default
LAPACK: /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRlapack.dylib

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] parallel  stats4    stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] EnsDb.Hsapiens.v86_2.99.0 ensembldb_2.14.0          AnnotationFilter_1.14.0   GenomicFeatures_1.42.3   
 [5] GenomicRanges_1.42.0      GenomeInfoDb_1.26.7       xlsx_0.6.5                org.Hs.eg.db_3.12.0      
 [9] AnnotationDbi_1.52.0      IRanges_2.24.1            S4Vectors_0.28.1          Biobase_2.50.0           
[13] BiocGenerics_0.36.1      

loaded via a namespace (and not attached):
 [1] Rcpp_1.0.6                  lattice_0.20-41             prettyunits_1.1.1          
 [4] Rsamtools_2.6.0             xlsxjars_0.6.1              Biostrings_2.58.0          
 [7] assertthat_0.2.1            utf8_1.2.1                  BiocFileCache_1.14.0       
[10] R6_2.5.0                    RSQLite_2.2.6               httr_1.4.2                 
[13] pillar_1.6.0                zlibbioc_1.36.0             rlang_0.4.10               
[16] progress_1.2.2              lazyeval_0.2.2              curl_4.3                   
[19] rstudioapi_0.13             blob_1.2.1                  Matrix_1.3-2               
[22] BiocParallel_1.24.1         stringr_1.4.0               ProtGenerics_1.22.0        
[25] RCurl_1.98-1.3              bit_4.0.4                   biomaRt_2.46.3             
[28] DelayedArray_0.16.3         compiler_4.0.3              rtracklayer_1.50.0         
[31] pkgconfig_2.0.3             askpass_1.1                 openssl_1.4.3              
[34] tidyselect_1.1.0            SummarizedExperiment_1.20.0 tibble_3.1.1               
[37] GenomeInfoDbData_1.2.4      matrixStats_0.58.0          XML_3.99-0.6               
[40] fansi_0.4.2                 crayon_1.4.1                dplyr_1.0.5                
[43] dbplyr_2.1.1                GenomicAlignments_1.26.0    bitops_1.0-6               
[46] rappdirs_0.3.3              grid_4.0.3                  lifecycle_1.0.0            
[49] DBI_1.1.1                   magrittr_2.0.1              stringi_1.5.3              
[52] cachem_1.0.4                XVector_0.30.0              xml2_1.3.2                 
[55] ellipsis_0.3.1              generics_0.1.0              vctrs_0.3.7                
[58] tools_4.0.3                 bit64_4.0.5                 glue_1.4.2                 
[61] purrr_0.3.4                 hms_1.0.0                   MatrixGenerics_1.2.1       
[64] fastmap_1.1.0               BiocManager_1.30.12         memoise_2.0.0              
[67] rJava_0.9-13
RNAseq AnnotationDbi • 208 views

Hi Dante,

In this case I suggest using biomaRt to solve your issue. The package vignette gives a clear explanation of how to use it. Make sure you use the latest version of the Ensembl annotation.
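
A minimal sketch of what such a biomaRt query might look like is given here. It is not a tested solution for this dataset: the function name `map_entrez_to_ensembl` is hypothetical, the attribute and filter names assume a recent Ensembl release, and the query needs the biomaRt package plus an internet connection.

```r
# Sketch: map Entrez gene IDs to Ensembl gene IDs via biomaRt.
# Assumes the Bioconductor package 'biomaRt' is installed and that
# the Ensembl BioMart service is reachable.
map_entrez_to_ensembl <- function(entrez_ids) {
  # Connect to the human gene dataset of the current Ensembl release
  mart <- biomaRt::useEnsembl(biomart = "ensembl",
                              dataset = "hsapiens_gene_ensembl")

  # Returns one row per Entrez/Ensembl pair; an Entrez ID may map to
  # several Ensembl IDs, and IDs without a match are absent from the result
  biomaRt::getBM(
    attributes = c("entrezgene_id", "ensembl_gene_id", "gene_biotype"),
    filters    = "entrezgene_id",
    values     = unique(as.character(entrez_ids)),
    mart       = mart
  )
}

# Example call (Data$gene_id is the Entrez ID column from the question):
# mapping <- map_entrez_to_ensembl(Data$gene_id)
# Data$ensembl_gene_id <- mapping$ensembl_gene_id[
#   match(Data$gene_id, mapping$entrezgene_id)]
```

Including gene_biotype in the result makes it easy to check whether the pseudogenes and snoRNAs are now covered. Note that match() keeps only the first Ensembl ID when an Entrez ID maps to several.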

Best regards!
