Question: GFF3 Coordinates: Find Stop Codon on the - Strand
5.7 years ago by
Rvosa570
Leiden, the Netherlands
Rvosa570 wrote:

Given the following GFF3, where is the stop codon supposed to be:

scaffold1.1     maker   gene    247127  258737  .       -       .       ID=...
scaffold1.1     maker   CDS     258659  258737  .       -       1       ID=...
scaffold1.1     maker   CDS     254856  254976  .       -       2       ID=...
scaffold1.1     maker   CDS     251358  251395  .       -       1       ID=...
scaffold1.1     maker   CDS     250084  250198  .       -       2       ID=...
scaffold1.1     maker   CDS     248687  248760  .       -       1       ID=...
scaffold1.1     maker   CDS     247127  247239  .       -       0       ID=...

My reasoning so far has been:

  • the last CDS is the one at 247127..247239 on the minus strand
  • because we are reading from right to left, the stop codon is at 247127..247130
  • also because we are on the minus strand, we need to reverse complement 247127..247130
  • the coordinates are 1-based, so I need to subtract 1 for each coordinate for any language that has 0-based indexes
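A minimal Python sketch of the coordinate handling listed above, assuming the scaffold is held as a plain string. With Python's half-open slices only the start is shifted by one: the 1-based inclusive range start..end becomes scaffold[start - 1:end]. The function names and the toy sequence are made up for illustration.

```python
# Complement table for unambiguous DNA bases (upper and lower case).
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Complement every base, then reverse the string."""
    return seq.translate(_COMPLEMENT)[::-1]


def gff_subseq(scaffold: str, start: int, end: int, strand: str) -> str:
    """Return the bases for a GFF3 interval (1-based, inclusive on both ends).

    For the minus strand the piece is reverse complemented so that it reads
    5' to 3' in the direction of the feature.
    """
    if start < 1 or end < start or end > len(scaffold):
        raise ValueError("interval outside the scaffold")
    piece = scaffold[start - 1:end]  # 1-based inclusive -> 0-based half-open
    return reverse_complement(piece) if strand == "-" else piece


# Toy scaffold, positions 1..10 (hypothetical data).
toy = "AACTAGGCAT"
print(gff_subseq(toy, 3, 5, "+"))  # bases as written on the scaffold
print(gff_subseq(toy, 3, 5, "-"))  # same interval read on the minus strand
```

Here positions 3..5 are CTA on the scaffold, which reads as the stop codon TAG on the minus strand.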

Here's my confusion:

  • at 247127..247130 the sequence is GAT, so it's a reverse (but not complemented) stop codon. Is that right?
  • am I supposed to do something with the phase values?
gff3 codon coordinates strand • 2.1k views
modified 5.7 years ago by Istvan Albert ♦♦ 81k • written 5.7 years ago by Rvosa570

Isn't the sequence denoted by 247127..247130 of length 4, not 3?

written 5.7 years ago by hlapp0

Indeed it is, apologies. See how these coordinates are driving me crazy? Harumph. I meant to say 247127..247129.

written 5.7 years ago by Rvosa570
5.7 years ago by
Istvan Albert ♦♦ 81k
University Park, USA
Istvan Albert ♦♦ 81k wrote:

Your reasoning is correct: once the sequence is reverse complemented, the last codon should be the stop codon.

Also note that the last three bases are 247127, 247128 and 247129; you should not include 247130!
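As a sketch of that bookkeeping in Python, assuming (as this thread does) that the final CDS includes the stop codon: on the minus strand the last CDS in reading order is the one with the lowest coordinates, and the stop codon occupies its first three positions on the scaffold. The function name is hypothetical; the CDS intervals are copied from the question.

```python
from typing import List, Tuple

# CDS intervals from the question (1-based, inclusive), minus strand.
CDS = [
    (258659, 258737),
    (254856, 254976),
    (251358, 251395),
    (250084, 250198),
    (248687, 248760),
    (247127, 247239),
]


def stop_codon_coords(cds: List[Tuple[int, int]], strand: str) -> Tuple[int, int]:
    """Return the 1-based inclusive interval of the last codon of the CDS set."""
    if strand == "-":
        # Reading right to left, the coding sequence ends at the lowest start.
        start = min(s for s, _ in cds)
        return (start, start + 2)
    # On the plus strand it ends at the highest end coordinate.
    end = max(e for _, e in cds)
    return (end - 2, end)


# Stop codons TAA, TAG, TGA as they appear on the scaffold (forward strand)
# when the gene is on the minus strand, i.e. their reverse complements.
MINUS_STRAND_STOPS_ON_SCAFFOLD = {"TTA", "CTA", "TCA"}

first, last = stop_codon_coords(CDS, "-")
print(first, last, last - first + 1)  # interval and its width in bases
```

Before reverse complementing, a stop codon for this gene would therefore read TTA, CTA or TCA at 247127..247129 on the scaffold.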

The phase indicates how many bases of the current CDS will complete the codon that started in the previous CDS. It does not affect the stop codon.
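To illustrate, here is a small Python sketch of how the phase is used when splitting a single CDS into codons. It assumes the CDS sequence is already in reading orientation (reverse complemented for a minus-strand feature); the function name and the toy sequence are made up.

```python
from typing import List


def codons_in_frame(cds_seq: str, phase: int) -> List[str]:
    """Split one CDS segment into the complete codons it contains.

    The first `phase` bases belong to a codon started in the previous CDS,
    so they are skipped; a trailing partial codon is dropped as well.
    """
    if phase not in (0, 1, 2):
        raise ValueError("phase must be 0, 1 or 2")
    in_frame = cds_seq[phase:]
    usable = len(in_frame) - len(in_frame) % 3
    return [in_frame[i:i + 3] for i in range(0, usable, 3)]


# Toy segment (hypothetical): with phase 1 the leading G finishes a codon
# from the previous segment and the rest reads ATG AAA TAG.
print(codons_in_frame("GATGAAATAG", 1))
```

The stop codon is found by counting back from the 3' end of the last CDS, which is why the phase does not change where it sits.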

written 5.7 years ago by Istvan Albert ♦♦ 81k

Thank you very much for your reply, this is the first time where the 'phase' thing is starting to make sense. So is it then the case that, if we have only two CDSs in the same gene, then the phase of cds2 is going to be length(cds1) % 3?

ETA: if that's how it works then I can also see that the phase can't affect the stop codon, because for the stop codon we're just counting "backwards" from the last position.

modified 5.7 years ago • written 5.7 years ago by Rvosa570

yes, where % means the remainder after division
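A hedged sketch of this arithmetic in Python, using the definition given in the answer above (the phase counts the bases of the current CDS that complete a codon started in the previous one). Under that definition, length(cds1) % 3 is the number of leftover bases at the end of cds1 when cds1 itself has phase 0, and the phase of cds2 is the number of bases still missing from that codon, i.e. (3 - leftover) % 3. The two values coincide only when the leftover is 0. The function name is hypothetical.

```python
def next_phase(length: int, phase: int = 0) -> int:
    """Phase of the following CDS, given this CDS's length and its own phase."""
    if length < 1 or phase not in (0, 1, 2):
        raise ValueError("length must be positive and phase must be 0, 1 or 2")
    # Bases at the 3' end of this CDS that start a codon but do not finish it.
    leftover = (length - phase) % 3
    # Bases the next CDS must contribute to finish that codon.
    return (3 - leftover) % 3


# Compare the plain remainder with the phase of the next CDS.
for length in (6, 7, 8):
    print(length, length % 3, next_phase(length))
```

It is worth checking which convention a given annotation tool follows before relying on the phase column.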

written 5.7 years ago by Istvan Albert ♦♦ 81k