Question: Getting an exon-only sequence from the Ensembl REST API
3 months ago, Ali.B wrote:

Hello. First of all, please forgive me: I have little knowledge of the subject, as I come from a purely computer science background.

I'm building a web service around the Ensembl REST API. All was going well until I needed to get an exon-only sequence.

What I'm trying to do, given coordinates such as human:10:101654703-101659823:-1, is:
1- Get the sequence from the Ensembl REST API (easy enough).
2- Get the overlapping exons that are protein coding in that region (example from the API).
3- Use the start and end of those exons to get the whole overlapping exon sequence.
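
Steps 1 and 2 above correspond to two Ensembl REST endpoints, `/sequence/region/:species/:region` and `/overlap/region/:species/:region` with `feature=exon`. The following is only a minimal sketch that builds the request URLs from the coordinate string used in this question; it makes no network calls. The helper names `parse_region`, `sequence_url` and `exon_overlap_url` are hypothetical, and filtering down to protein-coding exons is deliberately left out.

```python
from urllib.parse import urlencode

SERVER = "https://rest.ensembl.org"


def parse_region(coords):
    """Split 'human:10:101654703-101659823:-1' into its parts."""
    species, chrom, span, strand = coords.split(":")
    start, end = (int(x) for x in span.split("-"))
    return species, chrom, start, end, int(strand)


def sequence_url(coords):
    """URL for step 1: the genomic sequence of the region on the given strand."""
    species, chrom, start, end, strand = parse_region(coords)
    return f"{SERVER}/sequence/region/{species}/{chrom}:{start}..{end}:{strand}"


def exon_overlap_url(coords):
    """URL for step 2: exon features overlapping the region.

    The strand is not put in the URL here (an assumption of this sketch);
    each returned exon carries its own strand, so filter on that afterwards.
    """
    species, chrom, start, end, _ = parse_region(coords)
    query = urlencode({"feature": "exon"})
    return f"{SERVER}/overlap/region/{species}/{chrom}:{start}-{end}?{query}"
```

Both URLs would be requested with a `Content-Type: application/json` header (or `text/plain` for the raw sequence), using whatever HTTP client the web service already uses.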

Now, here are the problems I'm facing:

1- I believe there are different sources for exons (ensembl, ensembl_havana, havana). Which should I use, and how? At the moment I'm prioritizing ensembl_havana and using only that, but I believe that is incorrect: since ensembl_havana means exons agreed upon by both teams, should I also add the remaining exons reported by only one of the teams?

2- What is an exon rank? I couldn't find any information about it.

3- How should I handle exons on the negative strand versus exons on the positive strand?

4- What is the exon version?

I apologise again for the number of questions, but I've been struggling with this for a week: I'm getting some valid results and more invalid ones.

Thank you.

3 months ago, swbarnes2 (United States) wrote:

Exon rank is the position of the exon within that transcript. The first exon is rank 1, the second is rank 2, etc.

"+" and "-" tell you whether the transcript runs forward or reverse on the genome. If you are getting your sequence from the genome coordinates, you might need to reverse complement it to get the sequence in the right orientation with regard to the transcript. If you are asking for sequence by transcript, gene, or exon ID, it should already be in the right orientation for that context.
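
As an illustration of the reverse-complement step, here is a minimal sketch. It assumes the sequence was fetched on the forward strand and uses only A, C, G, T and N; the function names `reverse_complement` and `orient` are made up for this example.

```python
# Translation table mapping each base to its complement (N stays N).
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq):
    """Complement every base, then reverse the string."""
    return seq.translate(_COMPLEMENT)[::-1]


def orient(forward_seq, strand):
    """Return the sequence as read along the transcript.

    strand is 1 for "+" and -1 for "-", as in the Ensembl coordinates.
    """
    return reverse_complement(forward_seq) if strand == -1 else forward_seq
```

For example, `orient("ATGC", -1)` gives `"GCAT"`, while `orient("ATGC", 1)` returns the input unchanged.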

Version is the version number attached to the exon's stable ID; it tells you which revision of the annotation for that exon you are looking at.


Thank you for the explanation. Just to make sure I understand exon rank: it doesn't matter if rank 1 doesn't have the lowest start coordinate compared to the rest? When building the exon sequence, should it start with the sequence of the exon with rank 1?

Reply written 3 months ago by Ali.B

If the gene runs backwards, exon 1 should have the highest genomic position. But regardless of the direction of the gene, exon #1 is the first exon of its transcript (it might not be first in another transcript of the same gene). If you are pulling it out by name or exon ID, it should be in the "right" orientation no matter what.
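
To make the ordering concrete, here is a rough sketch of joining exon sequences by rank when working from genomic coordinates. It assumes a forward-strand sequence for the region, 1-based inclusive coordinates, and exon records from a single transcript with `start`, `end`, `strand` and `rank` fields; those field names follow what the question describes, and `spliced_sequence` is a hypothetical helper, not part of any Ensembl client.

```python
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq):
    """Complement every base, then reverse the string."""
    return seq.translate(_COMPLEMENT)[::-1]


def spliced_sequence(forward_seq, region_start, exons):
    """Concatenate exon sequences in transcript order (rank 1 first).

    forward_seq  -- forward-strand sequence of the region
    region_start -- genomic coordinate of forward_seq[0] (1-based)
    exons        -- dicts with start, end, strand, rank; one transcript only
    """
    parts = []
    for exon in sorted(exons, key=lambda e: e["rank"]):
        lo = exon["start"] - region_start        # 0-based offset into forward_seq
        hi = exon["end"] - region_start + 1      # end is inclusive
        piece = forward_seq[lo:hi]
        if exon["strand"] == -1:
            # Reverse-strand exons are read in the opposite direction.
            piece = reverse_complement(piece)
        parts.append(piece)
    return "".join(parts)
```

With a toy region `"ACGGTTTTCCAT"` starting at position 101 and two reverse-strand exons, rank 1 at 110-112 and rank 2 at 101-103, this returns `"ATGCGT"`: rank 1 comes first even though it has the higher genomic position.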

Reply written 3 months ago by swbarnes2