I have some RNA transcript sequences that were generated by 454 sequencing and assembled using sequences from a reference genome. For each transcript I have its sequence and the GenBank ID of the reference sequence used for its assembly, but I don't have access to the assembly data.
How can I predict the protein-coding sequence of each transcript? Do I need to align the transcripts to the reference sequence first? Is there a standard protocol that people follow, and are there any software tools for doing this?
The sequences most likely contain some frameshift errors, so any method for detecting or correcting them would be much appreciated.
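
To make the question concrete, the naive baseline would be to translate each transcript in all six reading frames and keep the longest open reading frame. Below is a minimal sketch of that idea in plain Python, assuming the standard genetic code; the function names (`translate`, `longest_orf`) are hypothetical and not taken from any existing tool. It makes no attempt to handle frameshifts: a single inserted or deleted base would split the ORF across two frames.

```python
import re

# Standard genetic code, built from the conventional TCAG codon ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def translate(seq):
    """Translate a DNA sequence codon by codon; unknown codons become 'X'."""
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3)
    )


def longest_orf(seq):
    """Return (strand, frame, protein) for the longest ORF in six frames.

    An ORF here starts at a methionine and runs to the next stop codon,
    or to the end of the sequence if the transcript is truncated.
    """
    seq = seq.upper().replace("U", "T")  # accept RNA or DNA input
    best = ("+", 0, "")
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            protein = translate(s[frame:])
            for match in re.finditer(r"M[^*]*", protein):
                if len(match.group()) > len(best[2]):
                    best = (strand, frame, match.group())
    return best


if __name__ == "__main__":
    # Toy transcript: ATG GCT AAA TGA sits in frame 2 of the forward strand.
    print(longest_orf("CCATGGCTAAATGACC"))
```

The strand and frame are returned alongside the protein so that, if a transcript yields two shorter ORFs in adjacent frames instead of one long one, that pattern can be flagged as a possible frameshift.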