Question about annotating Ensembl IDs with biomaRt in R
9 weeks ago
Josh • 0

Hi All :)

I have a large list of unique Ensembl IDs (48,432) representing mouse genes. My only goal is to retrieve attributes such as gene name and NCBI ID to build a dataset with expression data. However, when I use the biomaRt package in R, I only get 44,980 results, and the remaining 3,000+ Ensembl IDs come back without the information I need.

Here are some of the Ensembl IDs that the Ensembl website shows as deprecated and no longer part of the Ensembl database:

ENSMUSG00000000325 \ ENSMUSG00000004613 \ ENSMUSG00000011052 \ ENSMUSG00000021745 \ ENSMUSG00000021867

Is there an explanation for why these IDs are not mapped to other IDs in the current version of Ensembl?

Thank you for your attention and comments :)

You should point biomaRt at the Ensembl release that was used to generate the original ID list, or rerun your analysis with a newer assembly/annotation version. Gene IDs get retired, merged, or split as assemblies and annotations improve, so mismatches between versions are expected and normal.
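As a rough sketch of the first option, assuming the original release number is known: `useEnsembl()` accepts a `version` argument that connects to the matching archive. The helper names `annotate_with_release` and `unmatched_ids` are made up for illustration, and the release number in the usage lines is an arbitrary placeholder, not the release your IDs came from.

```r
# Sketch only: query a specific Ensembl archive release for mouse gene annotation.
# `release` must be the release your ID list was built on (unknown here).
annotate_with_release <- function(ids, release) {
  mart <- biomaRt::useEnsembl(
    biomart = "ensembl",
    dataset = "mmusculus_gene_ensembl",
    version = release
  )
  # Attribute names can differ in older releases; check biomaRt::listAttributes(mart)
  # if "entrezgene_id" is not available in the archive you connect to.
  biomaRt::getBM(
    attributes = c("ensembl_gene_id", "external_gene_name", "entrezgene_id"),
    filters    = "ensembl_gene_id",
    values     = ids,
    mart       = mart
  )
}

# IDs that returned no row at all in the chosen release
unmatched_ids <- function(ids, annotation) {
  setdiff(unique(ids), annotation$ensembl_gene_id)
}

# Usage (needs network access; 102 is an arbitrary example release):
# ids <- c("ENSMUSG00000000325", "ENSMUSG00000011052", "ENSMUSG00000021745")
# ann <- annotate_with_release(ids, release = 102)
# unmatched_ids(ids, ann)
```

If `unmatched_ids()` returns an empty vector for a given release, that release is at least consistent with your ID list. Note that a gene can still lack a gene name or NCBI ID even when its Ensembl ID is present.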

Thanks for your answer.

I will look up the release date of the original dataset, match it to the Ensembl version that was current at that time, and then retry the annotation.
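A minimal sketch of that lookup, assuming the archive table has the layout returned by `biomaRt::listEnsemblArchives()` (character columns `date`, such as "Feb 2023", and `version`). The helper `pick_release` is hypothetical, and the result is only a first guess, since a dataset may have been built on a release older than the one current at its publication date.

```r
# Sketch only: pick the latest Ensembl release published on or before a given date.
# `archives` is assumed to look like the output of biomaRt::listEnsemblArchives().
pick_release <- function(archives, dataset_date) {
  # Non-numeric versions (such as the GRCh37 archive) become NA and are dropped
  version <- suppressWarnings(as.integer(as.character(archives$version)))
  # Month abbreviations are parsed with %b, which assumes an English locale
  released <- as.Date(paste("01", as.character(archives$date)), format = "%d %b %Y")
  keep <- !is.na(version) & !is.na(released) & released <= as.Date(dataset_date)
  if (!any(keep)) return(NA_integer_)
  version[keep][which.max(as.numeric(released[keep]))]
}

# Usage (needs network access; the date is a made-up example):
# pick_release(biomaRt::listEnsemblArchives(), "2021-03-15")
```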


