Can't convert dog ensembl IDs into gene names
5 months ago
fifty_fifty ▴ 30

I usually use biomaRt to convert gene IDs to symbols. However, this time the Ensembl IDs I have (dog) do not match the Ensembl IDs in the biomaRt dataset "clfamiliaris_gene_ensembl".

I also tried the Ensembl web portal, where the dog dataset is called ROS_Cfam_1.0. It looks like my genes do not match the genes in that dataset either. My genes look like this:

"ENSCAFG00000045440" "ENSCAFG00000000001" "ENSCAFG00000000002" "ENSCAFG00000041462" "ENSCAFG00000000005"

Here is my biomaRt code:

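library(biomaRt)

# Connect to the Ensembl mart and select the default dog dataset
ensembl <- useMart("ensembl")
ensembl <- useDataset("clfamiliaris_gene_ensembl", mart = ensembl)

# Look up gene symbols for the Ensembl gene IDs stored as row names
gene_id <- getBM(attributes = c('ensembl_gene_id', 'external_gene_name'),
                 values = rownames(mydata),
                 filters = c('ensembl_gene_id'), mart = ensembl)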
gene_id
[1] ensembl_gene_id    external_gene_name
<0 rows> (or 0-length row.names)

It doesn't find my values. Should I use a different dataset for dogs?

RNA-seq dog ensembl biomart • 848 views

Looks like these gene IDs are from a different dog breed, namely boxer. See if clfamiliarisboxer_gene_ensembl works?

5 months ago
Ben_Ensembl ★ 2.0k

You are right, Genomax. These IDs are from the Boxer dog genome assembly: https://www.ensembl.org/Canis_lupus_familiaris/Info/Strains?db=core

However, BioMart is not available for dog breeds, nor for other species and strains: https://www.ensembl.info/2021/01/20/important-changes-of-data-availability-in-ensembl-gene-trees-and-biomart/

Instead, you can use the POST lookup/id REST API endpoint to retrieve the gene symbol for a list of gene IDs from any species: http://rest.ensembl.org/documentation/info/lookup_post
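
For illustration, a minimal Python sketch of a POST lookup/id call might look like the following. It assumes the documented request shape (a JSON body with an "ids" list, at most 1000 IDs per request) and that each returned record carries the symbol in a "display_name" field, with unrecognised IDs coming back as null. The helper names are hypothetical, and only the standard library is used.

```python
import json
import urllib.request

LOOKUP_URL = "https://rest.ensembl.org/lookup/id"
MAX_IDS_PER_POST = 1000  # assumed per-request limit, per the endpoint docs


def chunked(ids, size=MAX_IDS_PER_POST):
    """Yield successive slices of `ids`, each holding at most `size` items."""
    for start in range(0, len(ids), size):
        yield ids[start:start + size]


def symbols_from_response(response):
    """Map each gene ID to its display_name, or None if there is no symbol.

    `response` is the decoded JSON object: a dict keyed by gene ID whose
    values are record dicts (or None for IDs the server did not recognise).
    """
    symbols = {}
    for gene_id, record in response.items():
        symbols[gene_id] = record.get("display_name") if record else None
    return symbols


def lookup_symbols(gene_ids):
    """POST the IDs to the lookup endpoint in batches and merge the results."""
    symbols = {}
    for batch in chunked(list(gene_ids)):
        request = urllib.request.Request(
            LOOKUP_URL,
            data=json.dumps({"ids": batch}).encode("utf-8"),
            headers={
                "Content-Type": "application/json",
                "Accept": "application/json",
            },
            method="POST",
        )
        with urllib.request.urlopen(request) as reply:
            symbols.update(symbols_from_response(json.load(reply)))
    return symbols


# Example call (needs network access):
# lookup_symbols(["ENSCAFG00000045440", "ENSCAFG00000000001"])
```

Genes without an assigned symbol map to None here, so it is worth deciding up front whether to keep the Ensembl ID as a fallback label for those rows.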

5 months ago
Shred ▴ 620

I've posted a script on Gist that downloads a CSV table mapping Ensembl gene IDs to symbols for any of the species supported in Ensembl. It's written in Python 3.


Can you clarify what you mean by "supported species"? Also, please include a usage statement for the script in the gist to help novice users.

5 months ago
tamerg ▴ 100

You can achieve this conversion with biobtreeR. Simply use all these dog species names along with human; building the local database takes a few minutes:

bbBuildCustomDB(rawArgs = "-s homo_sapiens,canis_lupus_dingo,canis_lupus_familiaris,canis_lupus_familiarisbasenji,canis_lupus_familiarisgreatdane build")

Then map your Ensembl IDs to human with a query (you can check the docs for examples), and from there you will reach the gene symbols.
