Question: Gaps in protein coding sequences allowed?
16 months ago, dieter wrote:

Hi to all, I have a short question: After aligning protein coding nucleotide sequences I sometimes find gaps within a codon. Are these gaps always sequencing errors, or can a protein coding nucleotide sequence actually have gaps in the codons? All the best Dieter

msa protein alignment gaps • 667 views
modified 16 months ago • written 16 months ago by dieter (0)

When aligning two sequences, you're trying to maximize some measure of similarity between them. If you allow them, gaps can sometimes improve the quality of the alignment. How to interpret these gaps depends on the context. For example, when comparing sequences from different species, you may attribute the gaps to evolution.
I am not sure what you mean by "can a protein coding nucleotide sequence actually have gaps in the codons". Nucleic acids don't have physical gaps if that's what you mean.
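To illustrate how allowing gaps can improve an alignment score, here is a minimal sketch. The function name `score_alignment`, the toy sequences, and the scoring values (match +1, mismatch -1, gap -2) are assumptions made for illustration only; real aligners use substitution matrices and affine gap penalties.

```python
def score_alignment(a, b, match=1, mismatch=-1, gap=-2):
    """Score two already-aligned sequences of equal length, column by column."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have the same length")
    score = 0
    for x, y in zip(a, b):
        if x == "-" or y == "-":
            score += gap        # a gap in either sequence is penalized
        elif x == y:
            score += match      # identical nucleotides
        else:
            score += mismatch   # substitution
    return score


# Same two toy sequences, aligned without and with gaps.
ungapped = score_alignment("ATGGCCAAA", "ATGCCAAAT")
gapped = score_alignment("ATGGCCAAA-", "ATG-CCAAAT")
print(ungapped, gapped)  # prints: 3 4
```

With these arbitrary penalties, the gapped alignment scores higher because one gap restores the register of the downstream matches.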

written 16 months ago by Jean-Karim Heriche (18k)

Hello, Thank you.

When aligning two sequences, you're trying to maximize some measure of similarity between them. If you allow them, gaps can sometimes improve the quality of the alignment. How to interpret these gaps depends on the context. For example, when comparing sequences from different species, you may attribute the gaps to evolution.

Yes, I knew that.

I am not sure what you mean by "can a protein coding nucleotide sequence actually have gaps in the codons". Nucleic acids don't have physical gaps if that's what you mean.

OK, I think I need to explain in more detail. A protein coding nucleotide sequence (DNA) is a sequence that can be transcribed into RNA, and this RNA can be translated into a protein. Such a nucleotide sequence always consists of "codons"; a codon is a set of three nucleotides. You can read more about it here: https://en.wikipedia.org/wiki/Coding_region

Nucleic acids don't have physical gaps if that's what you mean.

No, that's not what I meant --> a better explanation of what I meant follows below (sorry, I'm a complete beginner)

All the best Dieter

modified 16 months ago • written 16 months ago by dieter (0)

Did you discover introns?

I'm pretty sure that Jean-Karim Heriche knows about codons, dna and transcription. I'm less sure about your biological understanding, or you need to explain it again because it doesn't make sense to me.

modified 16 months ago • written 16 months ago by WouterDeCoster (37k)

Hi, Thank you,

Did you discover introns?

No, introns need to be excluded from a protein coding region. You only use the exons.

I'm pretty sure that Jean-Karim Heriche knows about codons, dna and transcription.

Ah, OK - I'm very sorry for explaining it. Wait - I will prepare a picture to explain what I mean. All the best Dieter

modified 16 months ago • written 16 months ago by dieter (0)

Hi, OK, I hope this explains my question. Here is a typical "CDS alignment". These are exons only, already "cleaned": LINK In the middle you can see a single gap of the kind I sometimes find. My question is: is this a real gap that genuinely belongs to the sequence of that species, or is it just an error that happened during sequencing? Thanks for your answers. All the best, Dieter

modified 16 months ago • written 16 months ago by dieter (0)

It depends on the context, but a single-nucleotide gap in a column that is otherwise very conserved, in a sequence that's otherwise almost identical to the others, could indeed suggest a sequencing error.
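As a rough way to look for this pattern, the following sketch scans an alignment for isolated single-nucleotide gaps in columns where all other sequences agree. The function name and the toy alignment are hypothetical, and the strict "all others identical" rule is an assumption; a hit is only a candidate for closer inspection (for example against the raw reads), not proof of a sequencing error.

```python
def isolated_gaps_in_conserved_columns(alignment):
    """Return (sequence_index, column) pairs where one sequence has a lone
    gap and every other sequence carries the same nucleotide in that column."""
    hits = []
    n_cols = len(alignment[0])
    for i, seq in enumerate(alignment):
        for col in range(n_cols):
            if seq[col] != "-":
                continue
            # Skip gaps that belong to a longer gap run.
            left_is_gap = col > 0 and seq[col - 1] == "-"
            right_is_gap = col < n_cols - 1 and seq[col + 1] == "-"
            if left_is_gap or right_is_gap:
                continue
            # Characters found in this column in all the other sequences.
            others = {s[col] for j, s in enumerate(alignment) if j != i}
            if len(others) == 1 and "-" not in others:
                hits.append((i, col))
    return hits


toy_alignment = [
    "ATGGCCAAATTT",
    "ATGGCCAAATTT",
    "ATGGC-AAATTT",  # lone gap at column 5 (0-based)
]
hits = isolated_gaps_in_conserved_columns(toy_alignment)
print(hits)  # prints: [(2, 5)]
```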

written 16 months ago by Jean-Karim Heriche (18k)
16 months ago, dieter wrote:

Thank you very much Jean-Karim. You wrote that it depends on the context. That means some protein coding sequences can contain single gaps in codons and some cannot, right? So how can I find out whether a particular region can contain gaps or not? In many studies I have the beta-tub gene (Exon 5 to 6) and the ef-1alpha (Domain I-III). Would these exons allow single gaps? (These are only two examples; the main question is as above.) All the best, Dieter

written 16 months ago by dieter (0)

Please use the 'add reply' button to reply to a comment. This keeps the discussion organized.
It's possible that a coding sequence is missing nucleotides compared to related sequences, e.g. pseudogenes. However, a single mutation causing a frameshift or an inactive protein in a highly conserved sequence looks like an unlikely event, although the sequence could come from a recent duplication. If the context (biological or otherwise) doesn't offer any indication to the contrary, then I wouldn't rule out a sequencing error.
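The frameshift point can be checked mechanically: a gap run whose length is not a multiple of three shifts the reading frame relative to the other sequences. The sketch below is only an illustration; the function name and example sequences are hypothetical, and it does not account for terminal gaps (which may just be missing data) or for several nearby gaps that restore the frame together.

```python
import re


def frameshifting_gap_runs(aligned_seq):
    """Return (start_column, length) for every gap run in an aligned coding
    sequence whose length is not a multiple of 3."""
    return [
        (m.start(), len(m.group()))
        for m in re.finditer(r"-+", aligned_seq)
        if len(m.group()) % 3 != 0
    ]


single_gap = frameshifting_gap_runs("ATGGC-AAATTT")
codon_gap = frameshifting_gap_runs("ATG---AAATTT")
print(single_gap)  # prints: [(5, 1)]
print(codon_gap)   # prints: []
```

A three-nucleotide gap corresponds to a whole missing codon and keeps the frame, which is why only the single gap is reported.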

written 16 months ago by Jean-Karim Heriche (18k)

Thank you Jean-Karim, that answers my question. All the best, Dieter

written 15 months ago by dieter (0)