How to extract start codon position when start codon annotation does not exist
20 months ago

Dear all,

I have a module that uses the pyensembl library to get the length of a transcript sequence up to the start codon, given a transcript ID. Please see the sample:

len_first_exons_up_start_codon = transcript.start_codon_spliced_offsets[0]
var_start = var_start + len_first_exons_up_start_codon
var_end = var_end + len_first_exons_up_start_codon
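
To make the intent concrete, here is a minimal pure-Python sketch of what I understand a "spliced offset" to be: the 0-based distance from the 5' end of the transcript to a genomic position, counting exonic bases only. The function name `spliced_offset`, the 1-based inclusive exon intervals, and the example coordinates are assumptions for illustration, not pyensembl's implementation.

```python
def spliced_offset(exons, strand, position):
    """Return the 0-based offset of a genomic position in the spliced transcript.

    exons:    list of (start, end) genomic intervals, 1-based inclusive
              (hypothetical input format, assumed non-overlapping)
    strand:   "+" or "-"
    position: genomic coordinate that must fall inside one of the exons
    """
    # Walk exons in transcription order: ascending on "+", descending on "-".
    ordered = sorted(exons, reverse=(strand == "-"))
    offset = 0
    for start, end in ordered:
        if start <= position <= end:
            # Distance from the 5' end of this exon to the position.
            within = position - start if strand == "+" else end - position
            return offset + within
        # The whole exon lies upstream of the position, so count all of it.
        offset += end - start + 1
    raise ValueError("position %d is not inside any exon" % position)


# Made-up two-exon transcript, for illustration only.
example_exons = [(100, 199), (300, 399)]
plus_offset = spliced_offset(example_exons, "+", 310)
minus_offset = spliced_offset(example_exons, "-", 150)
```

With a start codon annotation, the first base of the start codon would be passed as `position`; my problem is that this transcript has no such annotation to pass in.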


The code is intended for the human genome, with GRCh37 as the reference. Everything works, except that for the Ensembl transcript ID ENST00000327122 I get a ValueError from pyensembl, which, abbreviated, looks like:

ValueError: Transcript ENST00000327122 does not contain feature start_codon
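
For now, a minimal sketch of how the call could be guarded so the module does not crash on such transcripts. It assumes the missing annotation always surfaces as a `ValueError`, as in the message above; the helper names `start_codon_offset_or_none` and `shift_variant` are hypothetical, and returning `None` only sidesteps the error without recovering the start codon position.

```python
def start_codon_offset_or_none(transcript):
    """Return the first spliced start-codon offset, or None if it is not annotated.

    `transcript` is expected to expose `start_codon_spliced_offsets`, the same
    attribute used in the sample above. Assumption: a missing start_codon
    feature is reported as a ValueError.
    """
    try:
        return transcript.start_codon_spliced_offsets[0]
    except ValueError:
        return None


def shift_variant(transcript, var_start, var_end):
    """Shift variant coordinates by the offset of the start codon.

    Returns None when the transcript has no start codon annotation, so the
    caller can decide how to handle that case explicitly.
    """
    offset = start_codon_offset_or_none(transcript)
    if offset is None:
        return None
    return var_start + offset, var_end + offset
```

This keeps the rest of the pipeline running, but transcripts like ENST00000327122 are then skipped rather than handled.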


Looking up ENST00000327122 on ensembl.org, I can see that it contains only two exons, but I still don't know how to get the start codon position with pyensembl for this kind of transcript.

I would be grateful for any ideas on how to solve this!

pyensembl start_codon coding_sequence GRCh37