Hi all,
Long-time lurker, first-time poster, so please excuse me if I don't provide enough info.
My objective is to take Illumina sequencing data and process it to obtain annotated GenBank files for our engineered E. coli strains. We need these for designing primers, etc., to further engineer our strains.
We mostly use breseq, as the output is straightforward for beginners in bioinformatics like myself. It gets tricky, though, with sections of the genome that don't map to our reference (selectable markers, etc.).
How would the bioinformatics veterans of Biostars take the breseq output, align the unmatched reads, and then annotate the resulting genome?
The output files of breseq are:
output.vcf
reference.bam
reference.bam.bai
reference.fasta
reference.fasta.fai
reference.gff3
summary.json
R1_001.unmatched.fastq
R2_001.unmatched.fastq
annotated.gd
output.gd
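
In case it helps gauge the scale of the problem, here is a minimal sketch (plain Python, standard library only) for counting how many reads and bases ended up in the two unmatched FASTQ files. It assumes uncompressed FASTQ with four lines per record; `fastq_stats` is a name made up for illustration and is not part of breseq.

```python
from pathlib import Path


def fastq_stats(path):
    """Return (read_count, total_bases) for an uncompressed 4-line FASTQ file."""
    reads = 0
    bases = 0
    with open(path) as handle:
        while True:
            header = handle.readline()
            if not header:
                break  # end of file
            if not header.strip():
                continue  # tolerate stray blank lines between records
            seq = handle.readline().rstrip("\n")
            plus = handle.readline()
            handle.readline()  # quality line, not needed for counting
            if not header.startswith("@") or not plus.startswith("+"):
                raise ValueError(f"Malformed FASTQ record near read {reads + 1} in {path}")
            reads += 1
            bases += len(seq)
    return reads, bases


if __name__ == "__main__":
    # File names taken from the breseq output listed above.
    for name in ("R1_001.unmatched.fastq", "R2_001.unmatched.fastq"):
        if Path(name).exists():
            n_reads, n_bases = fastq_stats(name)
            print(f"{name}: {n_reads} reads, {n_bases} bases")
        else:
            print(f"{name}: not found in the current directory")
```

Run from the directory holding the breseq output, this prints one line per file, which gives a rough idea of how much sequence is missing from the reference.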