Using Ensembl BioMart to retrieve orthologous coding sequences
10.0 years ago
ishengomae ▴ 110

I have a list of transcript IDs for cow coding sequences, and I want to use these IDs to retrieve the orthologous sequences from other Ensembl species. One such ID is ENSBTAT00000000080.

With the Ensembl BioMart the procedure seemed intuitive to me...
1. Database: Ensembl 75
2. Dataset: Bos taurus
3. Filters: Gene (checked the ID list box and pasted the list of IDs); Multi-species comparison (checked the Homolog filter box and selected Orthologous genes, e.g. Dog Orthologous genes)
4. Attributes: ticked the Sequence button at the top.
5. Clicked to get the results.

...until the results turned out not to be the correct ones! I am getting 100% identical sequences whichever organism I request orthologs for. For example, ENSBTAT00000000080 returns the same sequence for dog as for duck. Something is not right. What step am I missing?

gene • 3.3k views
10.0 years ago

The sequence comes from the dataset you're querying (here, cow), so it will always be the same: the homolog filter only restricts the results to cow genes that have an ortholog in the chosen species. You need to get a list of orthologous gene/transcript IDs and then query their sequences in the dataset for that species. Perhaps you can do that in BioMart online, but if not, it's doable with the biomaRt package in R by just doing those steps programmatically.
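
For anyone who'd rather script it, here is a rough sketch of that two-step idea in Python. It only builds the two BioMart XML queries (step 1: cow transcript IDs to ortholog gene IDs; step 2: ortholog gene IDs to coding sequences in the target species' dataset). Everything here is an assumption to check against the BioMart attribute and filter listings for your release: the helper names are made up for illustration, and the dataset and attribute names (`btaurus_gene_ensembl`, `cfamiliaris_homolog_ensembl_gene`, `coding`) follow the usual BioMart naming pattern but may differ between Ensembl versions.

```python
import xml.etree.ElementTree as ET


def build_query(dataset, filters, attributes):
    """Return a BioMart XML query string for a single dataset.

    `filters` maps a filter name to a list of values; `attributes` is a
    list of attribute names. All names are passed through unchecked.
    """
    query = ET.Element("Query", {
        "virtualSchemaName": "default",
        "formatter": "TSV",
        "header": "0",
        "uniqueRows": "1",
        "datasetConfigVersion": "0.6",
    })
    ds = ET.SubElement(query, "Dataset", {"name": dataset, "interface": "default"})
    for name, values in filters.items():
        # BioMart takes multiple filter values as one comma-separated string.
        ET.SubElement(ds, "Filter", {"name": name, "value": ",".join(values)})
    for name in attributes:
        ET.SubElement(ds, "Attribute", {"name": name})
    return ET.tostring(query, encoding="unicode")


def ortholog_id_query(cow_transcripts, target_prefix):
    """Step 1: ask the cow dataset for ortholog gene IDs in the target species.

    `target_prefix` is the dataset prefix of the target species
    (assumed here: "cfamiliaris" for dog).
    """
    return build_query(
        "btaurus_gene_ensembl",  # assumed cow dataset name
        {"ensembl_transcript_id": cow_transcripts},
        ["ensembl_transcript_id", f"{target_prefix}_homolog_ensembl_gene"],
    )


def coding_sequence_query(target_prefix, gene_ids):
    """Step 2: ask the target species' own dataset for coding sequences."""
    return build_query(
        f"{target_prefix}_gene_ensembl",  # assumed target dataset name
        {"ensembl_gene_id": gene_ids},
        ["ensembl_gene_id", "ensembl_transcript_id", "coding"],
    )


if __name__ == "__main__":
    # Print the step 1 query for the example ID from the question.
    print(ortholog_id_query(["ENSBTAT00000000080"], "cfamiliaris"))
```

Each XML string would then be sent as the `query` parameter to the BioMart `martservice` endpoint of the Ensembl release you are using; the gene IDs that come back from the first query go into `coding_sequence_query` for the second. Note that the second step returns every transcript of each ortholog gene, so you may still need to pick one transcript per gene.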


Thanks very much.

10.0 years ago
Vitis ★ 2.5k

Take a look at the Ensembl Compara Perl API, with which you can query orthologous genes (predicted by Ensembl using MCL and gene trees) in batches.
