I am using my 454 data for OTU analysis in mothur, and I am confused by the result of converting my sff file to FASTA. Sequencing information: platform 454 FLX; flow pattern TACG; barcode (AAAAAAAC), removed by the sequencing center; primer GAGTTTGATCNTGGCTCAG.
However, I have trouble understanding one section of the sequence (bases 5 to 12). The primer starts at base 13. I have attached the FASTA output from different toolkits.
sff_extract (from seq_crumbs toolkit) with clipping:
sff_extract (from seq_crumbs toolkit) without clipping:
mothur output after denoise:
Can anyone help me understand the agagcgaa part of the sequence? Based on the information from the sequencing center, it is not the barcode. How should I deal with it? For example, is there a way to remove this region in mothur? Thank you!
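To make the layout I am describing concrete, here is a minimal Python sketch (not mothur) that finds the primer in a read, treating the N as any base, and reports what sits in front of it. The function names, the 4-base key length, and the example read are my own assumptions for illustration; the read is built by hand from the pieces mentioned above and is not one of my real sequences.

```python
import re

# IUPAC ambiguity codes mapped to regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}


def primer_regex(primer):
    """Compile a primer containing IUPAC codes into a regular expression."""
    return re.compile("".join(IUPAC[base] for base in primer.upper()))


def split_at_primer(read, primer, key_len=4):
    """Split a read into key, unexplained bases, and the part after the primer.

    key_len=4 is an assumption (bases 1 to 4 treated as the sequencing key).
    Returns None if the primer is not found.
    """
    match = primer_regex(primer).search(read.upper())
    if match is None:
        return None
    start = match.start()
    key_end = min(key_len, start)
    return {
        "key": read[:key_end],
        "unknown": read[key_end:start],   # bases between key and primer
        "primer_start": start + 1,        # 1-based position
        "insert": read[match.end():],     # sequence after the primer
    }


if __name__ == "__main__":
    # Hypothetical read: assumed key + the puzzling 8 bases + primer + made-up insert.
    example = "tcag" + "agagcgaa" + "GAGTTTGATCCTGGCTCAG" + "ACGTACGT"
    print(split_at_primer(example, "GAGTTTGATCNTGGCTCAG"))
```

Running this over every read in the unclipped FASTA would show whether the bases in front of the primer are the same in all reads or vary between them, which may help decide how to trim them.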