Question

Forum: Can exons from different reading frames coexist in a peptide sequence?

2.6 years ago

ricfoz ▴ 100

Hello everyone

I am building a program to predict peptide sequences from DNA data. I am still in an early phase and, while running a test, I ran into the following problem:

1.- I downloaded from the NCBI a gene sequence along with its corresponding 'computationally predicted' protein sequence.

2.- Then in my program, in the very first phase, I translated the DNA FASTA file into its six possible reading frames so as to explore which one is open. I identified the longest ORF in the first forward reading frame, and found this longest peptide to be present in the protein sequence I originally downloaded. Good.

3.- The first few amino acids of the protein sequence downloaded from the NCBI are not present in the translation of the first forward reading frame, which is the open one, but are actually found in the third forward reading frame. Therefore, I am facing a protein sequence from NCBI which appears to be the result of merging exons from two different reading frames.
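
For readers who want to reproduce the check in steps 2 and 3, here is a minimal Python sketch. It assumes the standard genetic code; the function names (`translate`, `six_frames`, `longest_open_stretch`, `frames_containing`) and the toy sequence are illustrative assumptions, not the poster's actual program.

```python
from itertools import product

# Standard genetic code, built in the conventional T, C, A, G codon order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def translate(dna):
    """Translate codon by codon from position 0; unknown codons become 'X'."""
    dna = dna.upper()
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3)
    )


def six_frames(dna):
    """Return {label: protein} for the three forward and three reverse frames."""
    dna = dna.upper()
    reverse = dna.translate(COMPLEMENT)[::-1]
    frames = {}
    for offset in range(3):
        frames[f"+{offset + 1}"] = translate(dna[offset:])
        frames[f"-{offset + 1}"] = translate(reverse[offset:])
    return frames


def longest_open_stretch(protein):
    """Longest run of residues between stop codons ('*')."""
    return max(protein.split("*"), key=len)


def frames_containing(dna, peptide):
    """Labels of every frame whose translation contains the peptide."""
    return sorted(
        label for label, prot in six_frames(dna).items() if peptide in prot
    )


toy_gene = "ATGGCCAAATAG"  # hypothetical toy sequence, not real data
frames = six_frames(toy_gene)
hits = frames_containing(toy_gene, "MAK")
print(frames)
print(hits)
```

Running `frames_containing` once per peptide fragment of the predicted protein shows which frame of the unspliced gene each fragment falls in.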

As far as I know, this makes no biological sense, so my question/discussion point is whether it is biologically possible for a cell to come up with such a peptide sequence, or whether I should just regard this part of the predicted protein as wrong.

Thank you for your attention and Best regards.

prediction splicing codons protein exons • 2.2k views

ADD COMMENT • link updated 2.6 years ago by swbarnes2 14k • written 2.6 years ago by ricfoz ▴ 100

No biological sense? Do you understand what introns are?

ADD REPLY • link 2.6 years ago by swbarnes2 14k

Introns are excised from mRNA that was transcribed from one reading frame... I am asking if anyone knows about the possibility of having a protein resulting from the translation of TWO different reading frames, which would of course imply the involvement of two mRNA molecules stemming from reading the same DNA strand in two different frames. I would appreciate any constructive discussion.

Kind regards, community

ADD REPLY • link 2.6 years ago by ricfoz ▴ 100

What you described happening makes absolutely perfect biological sense if you are translating the gene sequence with all the introns still in there. There is no reason whatsoever to expect the exons to be in the same reading frame in an unspliced transcript. Indeed, it is obvious when looking at intron lengths: they are not constrained to multiples of three, so an intron can shift the frame of the exon that follows it.
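
As an illustration of the intron-length argument, here is a toy Python sketch. An intron whose length is not a multiple of three moves the downstream exon into a different frame of the unspliced sequence, while the spliced sequence reads in a single frame. The sequences are made up for the example and the standard genetic code is assumed.

```python
from itertools import product

# Standard genetic code, built in the conventional T, C, A, G codon order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def translate(dna):
    """Translate codon by codon from position 0; unknown codons become 'X'."""
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3)
    )


# Hypothetical toy gene: two exons separated by a 13-nt intron (13 % 3 == 1).
exon1 = "ATGGCCAAA"        # codes for M A K
intron = "GTAAGTTTTTCAG"   # starts with GT, ends with AG
exon2 = "GAAGATTGGTAA"     # codes for E D W, then a stop codon

genomic = exon1 + intron + exon2
spliced = exon1 + exon2

# 0-based frame offset of exon 2 in the unspliced sequence.
exon2_offset = (len(exon1) + len(intron)) % 3

frame1 = translate(genomic)       # forward frame 1, introns still present
frame2 = translate(genomic[1:])   # forward frame 2, introns still present
mature = translate(spliced)       # after removing the intron

print("exon 2 frame offset in genomic DNA:", exon2_offset)
print("genomic frame +1:", frame1)
print("genomic frame +2:", frame2)
print("spliced:         ", mature)
```

In the unspliced sequence the first peptide fragment appears in frame +1 and the second in frame +2, yet both come from one mRNA read in one frame after splicing.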

ADD REPLY • link 2.6 years ago by swbarnes2 14k

Yes, what you say indeed makes sense, but what I searched for are exons. I took two sequences from the predicted protein (they are present in the protein, therefore they come from exons), and searched for them in my translated gene. The first exon is present in one reading frame, and the second in a different reading frame.

ADD REPLY • link 2.6 years ago by ricfoz ▴ 100

So you are expecting all the exons in a whole gene to be in the same reading frame? And you think that because you don't see this the online gene annotations are wrong?

ADD REPLY • link 2.6 years ago by swbarnes2 14k

I am trying to comprehend the nature of what I have in hand at this early stage, so that I can continue coding in a way that makes biological sense. What I posted is an initial guess.

ADD REPLY • link 2.6 years ago by ricfoz ▴ 100

I fear you will have to seriously buff up your knowledge on this topic before embarking on implementing/programming things.

Start with the "central dogma of biology": basic stuff, but without it you're lost anyway. The whole issue you describe here is actually completely artificial, as, for instance, exons and peptides do not exist at the same level/time. Yes, we make an approximation to be able to comprehend and, mainly, visualize those things, but biologically speaking they have no meaning (e.g. a reading frame at the DNA level has no meaning, biologically speaking).

ADD REPLY • link 2.6 years ago by lieven.sterck 15k

Thank you for the discussion points, both lieven.sterck and swbarnes2. I am reviewing all subjects related to the central dogma and splicing. It is useful to know that my first approach and line of thinking were not right. As you mention, moving between different expression levels in code is tricky; my point was to have this discussion, which of course will change my approach.

Best regards

ADD REPLY • link 2.6 years ago by ricfoz ▴ 100

You're welcome. That's what we are here for. ;)

One other thing you might need to consider is the kingdom (of species) you work in. In prokaryotes your view is somewhat more accurate than for eukaryotes (your view is way off for those).

ADD REPLY • link 2.6 years ago by lieven.sterck 15k

No, I am working with eukaryotes, primates to be more specific. Actually, I am trying to predict where the introns are located. I got some annotated data; these are straightforward, since I have the coordinates where to splice, but I am running tests to be able to translate a gene without the annotations. Cheers.
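
The annotated case mentioned here can be sketched as follows in Python. This is a minimal example under stated assumptions: exon coordinates are 1-based and inclusive (as in GFF-style annotation), the gene is on the forward strand, and the standard genetic code applies. The `splice` helper, the toy sequence, and the coordinates are hypothetical.

```python
from itertools import product

# Standard genetic code, built in the conventional T, C, A, G codon order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def translate(dna):
    """Translate codon by codon from position 0; unknown codons become 'X'."""
    dna = dna.upper()
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3)
    )


def splice(genomic, exons):
    """Join exons given as (start, end) pairs, 1-based inclusive, forward strand.

    Reverse-strand genes would additionally need a reverse complement,
    which this sketch does not handle.
    """
    return "".join(genomic[start - 1:end] for start, end in sorted(exons))


genomic = "ATGGCCAAAGTAAGTTTTTCAGGAAGATTGGTAA"  # toy sequence, not real data
exons = [(1, 9), (23, 34)]                       # hypothetical annotation

cds = splice(genomic, exons)
protein = translate(cds)
print(cds)
print(protein)
```

Splicing first and translating afterwards means only one reading frame is ever involved, which is the order in which the cell does it.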

ADD REPLY • link 2.6 years ago by ricfoz ▴ 100

That is not a particularly easy problem. Indeed, there could be no such thing as alternate transcripts if splicing were 100% predictable from the sequence alone.
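
A small Python sketch of why sequence alone is ambiguous: most introns begin with GT and end with AG, but even a short toy sequence contains several GT...AG pairs, so this rule by itself yields many candidate introns. The `candidate_introns` function, the `min_len` value, and the toy sequence are assumptions for illustration; real introns are far longer, and real splice-site predictors use much more than these two dinucleotides.

```python
def candidate_introns(genomic, min_len=4):
    """List (start, end) slices that begin with GT and end with AG.

    Coordinates are 0-based and half-open, so genomic[start:end] is the
    candidate intron. min_len is an arbitrary toy threshold.
    """
    genomic = genomic.upper()
    donors = [i for i in range(len(genomic) - 1) if genomic[i:i + 2] == "GT"]
    acceptor_ends = [
        i + 2 for i in range(len(genomic) - 1) if genomic[i:i + 2] == "AG"
    ]
    return [
        (start, end)
        for start in donors
        for end in acceptor_ends
        if end - start >= min_len
    ]


genomic = "ATGGCCAAAGTAAGTTTTTCAGGAAGATTGGTAA"  # toy sequence, not real data
candidates = candidate_introns(genomic)

for start, end in candidates:
    print(start, end, genomic[start:end])
```

In this toy sequence the intended intron, `genomic[9:22]`, is only one of several candidates, and nothing in the GT...AG rule singles it out.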

ADD REPLY • link 2.6 years ago by swbarnes2 14k