Or use getorf from the EMBOSS package, which is available as a command-line executable, a web service, or a web page.
There is nothing fancy about ORF finding. ORFs don't need to be predicted (genes are predicted); they are simply found, either as any sequence that contains no internal stop codon and ends with a stop codon, or alternatively as any sequence between a start codon and a stop codon in the same frame. Because this is a plain scan, ORF finding takes all 6 frames into account automatically. getorf supports both modes (its -find option). Make sure to select the appropriate genetic code (the -table option).
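To make the two definitions above concrete, here is a minimal pure-Python sketch of a six-frame ORF scan. It is not how getorf is implemented; the function names, the `require_start` and `min_len` parameters, and the hard-coded standard genetic code (stops TAA/TAG/TGA, start ATG) are all assumptions made for illustration.

```python
# Sketch only: standard genetic code, unambiguous DNA (A, C, G, T) assumed.
STOP_CODONS = {"TAA", "TAG", "TGA"}
START_CODONS = {"ATG"}

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_orfs(seq, require_start=False, min_len=3):
    """Yield (strand, offset, start, end, orf) for every ORF in all 6 frames.

    start/end are 0-based, half-open indices on the strand being scanned
    (the reverse complement for strand "-"). The stop codon itself is not
    included in the returned ORF. Open regions that run off the end of the
    sequence without reaching a stop codon are not reported.
    """
    seq = seq.upper()
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for offset in range(3):
            # Stop-to-stop mode opens an ORF immediately; start-to-stop
            # mode waits for the first start codon.
            orf_start = None if require_start else offset
            for i in range(offset, len(s) - 2, 3):
                codon = s[i:i + 3]
                if codon in STOP_CODONS:
                    if orf_start is not None and i - orf_start >= min_len:
                        yield strand, offset, orf_start, i, s[orf_start:i]
                    orf_start = None if require_start else i + 3
                elif require_start and orf_start is None and codon in START_CODONS:
                    orf_start = i
```

For example, `list(find_orfs("ATGAAATAG", require_start=True))` reports a single ORF, `ATGAAA`, on the plus strand at offset 0. For anything other than the standard code, the stop and start sets would need to be changed to match the chosen translation table.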
Edit: one simple way to calculate the frame, in pseudocode, given start and stop (assuming 1-based coordinates):
if (+ strand; use the strand info in the header)
    # start < stop would also work, except for a circular genome with an ORF spanning the origin
    frame := (stop %modulo% 3) + 1
else
    # minus strand: I am not 100% sure this is the best way;
    # as written it yields 0, -1 or -2 rather than -1 to -3
    frame := - (stop %modulo% 3)
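The frame calculation can also be sketched in Python. The plus-strand branch is the formula from the pseudocode. The minus-strand branch swaps in a different convention, which is an assumption and not something settled in this post: the frame is counted from the start of the reverse-complemented sequence, which requires the sequence length. The function name `orf_frame` and the parameters `first`, `last` and `seq_len` are hypothetical. Coordinates are taken to be 1-based, inclusive, and given in reading direction, so a minus-strand ORF has `first > last`.

```python
def orf_frame(first, last, strand, seq_len):
    """Return the reading frame of an ORF: 1..3 for "+", -1..-3 for "-".

    first, last: 1-based inclusive forward-strand coordinates of the ORF's
                 first and last base in reading direction.
    strand:      "+" or "-" (taken from the header, not inferred from the
                 coordinates, so ORFs spanning the origin of a circular
                 genome are not misclassified).
    seq_len:     length of the full sequence; only used for the minus strand.
    """
    if strand == "+":
        # Formula from the post. Because the ORF length is a multiple of 3,
        # this equals ((first - 1) % 3) + 1.
        return (last % 3) + 1
    if strand == "-":
        # Assumed convention: position of the ORF's first base on the
        # reverse complement is seq_len - first + 1 (1-based).
        return -(((seq_len - first) % 3) + 1)
    raise ValueError("strand must be '+' or '-'")
```

With a 12-base sequence, `orf_frame(1, 9, "+", 12)` gives 1 and `orf_frame(12, 4, "-", 12)` gives -1. Other tools may number minus-strand frames differently, so it is worth checking against whatever consumes the frame value.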