I am not exactly new to sequence analysis, but I still feel very green. I have been to a few short courses on how to perform the usual analyses, and I feel comfortable with that. Well, to a point. I guess the more accurate way to put it is that I no longer think I'm doing it wrong.
My concern is with generating the assembly from the reads. In all the reading I have done and in the short courses, this step seems to be glossed over. I think I am producing good assemblies: another lab assembled some of the same data and their draft matched ours. However, it took them a day and it takes me a week.
I am currently working on WGS of dsRNA viruses (reoviruses) with in-house Ion Torrent data. My current workflow is to get the reads from the machine, filter out the low-quality reads, and then use MIRA or DNASTAR SeqMan NGen to assemble the reads de novo into contigs (de novo because of the high variability even between closely related strains). Then I take the contig FASTA file and BLAST it against NCBI to figure out which contig is which, put that information into a spreadsheet, and then work out the overlaps from the BLAST data by hand to build my genes.
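To show the kind of step I think could be scripted instead of done in a spreadsheet: a minimal sketch, assuming BLAST was run with tabular output (`-outfmt 6`, default twelve columns), that picks the best hit per contig by bitscore. The column layout is the standard BLAST+ default; the function name and the example rows are made up for illustration.

```python
import csv
from io import StringIO

def top_hits(blast_tab):
    """Return the best hit (highest bitscore) per query contig from
    BLAST tabular (-outfmt 6) text. Default columns: qseqid sseqid
    pident length mismatch gapopen qstart qend sstart send evalue
    bitscore."""
    best = {}
    for row in csv.reader(StringIO(blast_tab), delimiter="\t"):
        qseqid, sseqid = row[0], row[1]
        bitscore = float(row[11])
        # Keep only the strongest hit seen so far for this contig.
        if qseqid not in best or bitscore > best[qseqid][1]:
            best[qseqid] = (sseqid, bitscore)
    return best

# Hypothetical example rows, just to show the shape of the output.
example = "\n".join([
    "contig_1\tSeg1_refA\t98.5\t1200\t10\t0\t1\t1200\t1\t1200\t0.0\t2100",
    "contig_1\tSeg1_refB\t95.0\t1150\t40\t2\t1\t1150\t1\t1150\t0.0\t1800",
    "contig_2\tSeg4_refA\t97.2\t800\t15\t1\t1\t800\t1\t800\t0.0\t1400",
])
print(top_hits(example))
# contig_1 keeps Seg1_refA (bitscore 2100); contig_2 keeps Seg4_refA.
```

Something along these lines could replace the copy-into-spreadsheet step, leaving only the overlap checking to do by eye.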
Is this the right way to do this kind of assembly? I know it works, but it is pretty labor-intensive, and since we are still sequencing more samples, the backlog is getting pretty deep. Any insight into how I could speed this process up?