I want to compare a large list of gene sequence fragments between two species using dN/dS. The problem is that some of these fragments might start with the last nucleotide of a codon rather than with a whole codon. The program I am using to compute dN/dS will, of course, count the reading frame from the fragment's first nucleotide, even when that is wrong.
If I had a small number of gene fragments, I could check each fragment's frame by hand: download the entire gene sequence, find where my fragment starts within it, and check whether the fragment begins with a piece of the preceding codon. If the frame was incorrect, I would remove the starting nucleotide from the fragment so that it began with a full codon.
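To make the manual check concrete, here is a minimal sketch of what I imagine doing per fragment. It rests on several assumptions that may not hold for my data: the reference is the coding sequence only (starting at the first nucleotide of the start codon, no introns or UTRs), the fragment matches it exactly and on the same strand, and the fragment occurs only once in it. The function names `frame_offset` and `trim_to_frame` are made up for illustration.

```python
def frame_offset(fragment: str, cds: str) -> int:
    """Number of leading nucleotides in `fragment` that belong to a
    partial codon, judged by the fragment's position in `cds`.

    Assumes `cds` starts at codon position 1 and contains `fragment`
    as an exact, same-strand substring (hypothetical simplification).
    """
    start = cds.upper().find(fragment.upper())
    if start == -1:
        raise ValueError("fragment not found in the coding sequence")
    # start % 3 == 0 -> fragment begins on a codon boundary (trim 0)
    # start % 3 == 1 -> fragment begins at codon position 2 (trim 2)
    # start % 3 == 2 -> fragment begins at codon position 3 (trim 1)
    return (3 - start % 3) % 3


def trim_to_frame(fragment: str, cds: str) -> str:
    """Drop the leading partial codon so the fragment starts in frame."""
    return fragment[frame_offset(fragment, cds):]


if __name__ == "__main__":
    cds = "ATGGCCAAGTTT"          # codons: ATG GCC AAG TTT
    fragment = "GGCCAAGTTT"       # starts with the last nucleotide of ATG
    print(frame_offset(fragment, cds))
    print(trim_to_frame(fragment, cds))
```

In this toy case the fragment starts at the third position of the first codon, so one nucleotide is removed. Real fragments with mismatches against the reference would need an alignment step instead of an exact string search.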
Does anyone know of a way that this could be implemented on a large scale to check ~6,000 gene fragments?