I'm new to bioinformatics and have just been assigned to a genomic DNA sequencing and genome assembly project, and I would appreciate your advice on some basic questions!
We are conducting a population survey of ~90 strains of Bacillus subtilis (genome size = 4 Mb). We would like to do whole-genome sequencing on each of these strains. We already have many sequenced reference genomes and will use those as anchors.
We will purify DNA from each strain and will have 90 individual DNA samples. We want to send these DNA samples out to a university/company to be sequenced. Using that data, we intend to assemble/resequence the genome of each of these strains ourselves using Velvet, SPAdes, or an alternative genome assembler.
Right now, I am responsible for choosing where our DNA samples get sent, which NGS platform to use, and what run specifications to request for our project.
My issue is that I'm not sure which next-generation sequencing platform is suitable for our project, given that genome assembly is our goal. How do I pick between Illumina (MiSeq, HiSeq, etc.) and PacBio?
I also am unsure of what depth of coverage would be acceptable for our purposes; I've been told 10X should be good enough but that seems low to me - perhaps 20X would suffice?
Finally, I'm not certain what read length would be appropriate - do you know if paired-end reads of 2x150 bp or 2x250 bp would be good?
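To make the coverage and read-length options above concrete, here is a rough sketch of the arithmetic, assuming the usual approximation that average coverage = total sequenced bases / genome size. The function name `read_pairs_needed` and the round figure of 4,000,000 bp are illustrative assumptions, not values from any sequencing provider, and real runs lose some reads to quality filtering and uneven sample pooling.

```python
def read_pairs_needed(genome_size_bp, coverage, read_length_bp):
    """Smallest number of read pairs whose total bases reach the target coverage.

    Assumes average coverage = total sequenced bases / genome size and that
    each pair contributes 2 * read_length_bp bases (no trimming or filtering).
    """
    bases_needed = genome_size_bp * coverage
    bases_per_pair = 2 * read_length_bp
    # Integer ceiling division, so a fractional pair rounds up.
    return -(-bases_needed // bases_per_pair)


GENOME_SIZE_BP = 4_000_000  # assumed round figure for a ~4 Mb genome
N_STRAINS = 90              # approximate number of strains in the survey

for coverage in (10, 20):
    for read_length in (150, 250):
        pairs = read_pairs_needed(GENOME_SIZE_BP, coverage, read_length)
        print(
            f"{coverage}x with 2x{read_length} bp: "
            f"{pairs:,} pairs per strain, {pairs * N_STRAINS:,} pairs for all strains"
        )
```

Under these assumptions, 10x coverage with 2x250 bp reads works out to 80,000 read pairs per strain, which gives a starting point for comparing against the output a provider quotes for a given run.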
I'm from a different field so I have a lot to learn - I'd appreciate any pointers you have. Thanks!