I have a set of 454 metagenomic (amplicon) reads from which I am attempting to extract a particular taxon group.
I have done the following:
- Selected reference sequences for the key taxonomic family (from NCBI, plus some of our own).
- Created a reference alignment using MUSCLE.
- Aligned the NGS query sequences to the reference alignment using PyNAST, keeping the sequences that align closely and discarding unrelated taxa.
I now want to try to correct the extracted sequences based on the ORF of the functional gene we are using (rbcL). I can see a homopolymer that is causing an insertion, along with other errors. Is there a way to correct the sequences?
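
To illustrate the kind of check I have in mind, here is a minimal Python sketch. The function names are my own (hypothetical), not part of PyNAST or any other tool. It only flags reads that have a stop codon in every reading frame and lists homopolymer runs that might be responsible; it does not correct anything. It assumes the standard stop codons and that alignment gaps are written as `-`.

```python
from itertools import groupby

# Assumption: standard stop codons, as in NCBI translation table 11
# (bacterial, archaeal and plant plastid code).
STOP_CODONS = {"TAA", "TAG", "TGA"}


def internal_stops(seq, frame):
    """Return 0-based start positions of stop codons when reading
    `seq` in the given frame (0, 1 or 2). Gap characters are removed first."""
    seq = seq.upper().replace("-", "")
    return [i for i in range(frame, len(seq) - 2, 3)
            if seq[i:i + 3] in STOP_CODONS]


def frame_is_broken(seq):
    """True if all three forward frames contain at least one stop codon,
    which suggests a frameshift (or a non-coding / wrong-strand read)."""
    return all(internal_stops(seq, frame) for frame in range(3))


def homopolymer_runs(seq, min_len=4):
    """Return (start, base, length) for every run of a single base that is
    at least `min_len` long. These are candidate sites for 454-style indels."""
    runs, pos = [], 0
    for base, group in groupby(seq.upper()):
        length = len(list(group))
        if length >= min_len:
            runs.append((pos, base, length))
        pos += length
    return runs


if __name__ == "__main__":
    # Toy reads, not real rbcL data.
    reads = {
        "read_ok": "ATGGCTCCAAAAGGTTACGGT",
        "read_suspect": "TAAATAGATGA",
    }
    for name, seq in reads.items():
        print(name, "broken:", frame_is_broken(seq),
              "homopolymers:", homopolymer_runs(seq))
```

What I am missing is the next step: going from "this read is frameshifted near this homopolymer" to a corrected sequence.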