How to extract the start codon position when the start codon annotation does not exist


3.4 years ago

Damianos P. Melidis ▴ 40

Dear all,

I have a module that uses the pyensembl library to get, for a given transcript ID, the length of the transcript sequence up to the start codon. Here is a sample:

len_first_exons_up_start_codon = transcript.start_codon_spliced_offsets[0]
var_start = var_start + len_first_exons_up_start_codon
var_end = var_end + len_first_exons_up_start_codon

The code is intended for the human genome, with GRCh37 as the reference. Everything works, but for Ensembl ID ENST00000327122 I get a ValueError from pyensembl, which in abbreviated form looks like this:

ValueError: Transcript ENST00000327122 does not contain feature start_codon

Looking up ENST00000327122 on ensembl.org, I can see that it contains only two exons. However, I still don't know how to get the start codon position with pyensembl for a transcript like this one.

I would be grateful for any ideas on how to solve this!

pyensembl start_codon coding_sequence GRCh37 • 679 views
