I recently received MiSeq 250 bp PE sequencing results for a highly heterozygous eukaryote with a small genome (30-50 Mb). After doing some assemblies I realized that the pairs actually overlap and that the fragment size is around 300 bp. What a waste: lots of redundant sequencing. Most of the reads can be merged, and those that cannot are of very poor quality. I tried mapping them back and they are full of chimeras, so discarding them does not seem like a bad idea.
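To put a number on the redundancy, here is a minimal back-of-envelope sketch in Python. It assumes both mates are a full 250 bp and that the fragment is exactly 300 bp; real reads get trimmed and fragment sizes vary, so treat the result as a rough estimate. The function name is just something I made up for illustration.

```python
def pair_overlap(read_len: int, fragment_len: int) -> int:
    """Bases of a fragment that are covered by both mates of a pair.

    Assumes both mates have the same length. If the fragment is shorter
    than a single read, the whole fragment is covered twice.
    """
    return max(0, min(2 * read_len - fragment_len, fragment_len))


read_len = 250       # MiSeq read length from the run
fragment_len = 300   # approximate fragment size estimated from the assemblies

overlap = pair_overlap(read_len, fragment_len)
redundant_fraction = overlap / (2 * read_len)

print(overlap)             # 200
print(redundant_fraction)  # 0.4
```

So under those assumptions about 200 bp of each pair is sequenced twice, roughly 40% of the raw bases, and a merged read ends up about as long as the fragment itself.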
I also received some HiSeq mate pairs with a big insert size of about 2500 bp. After cleaning these up (removing adapters and contamination), however, their nucleotide coverage is very poor, around 5x, and the reads are only 50 bp in length. The MiSeq merged reads have a decent nucleotide coverage of around 300x to 450x.
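For anyone who wants to check the coverage numbers, this is the arithmetic I mean, as a rough Python sketch. The 40 Mb genome size is just an assumed midpoint of my 30-50 Mb estimate, and I am assuming the 5x figure counts both 50 bp mates of each pair. The second function computes physical (span) coverage, i.e. how often a position is bridged by an insert rather than by read bases, which is a different quantity from nucleotide coverage.

```python
def nucleotide_coverage(n_reads: int, read_len: int, genome_size: int) -> float:
    """Average number of read bases covering each genome position."""
    return n_reads * read_len / genome_size


def physical_coverage(n_pairs: int, insert_size: int, genome_size: int) -> float:
    """Average number of inserts spanning each genome position."""
    return n_pairs * insert_size / genome_size


genome_size = 40_000_000  # assumed midpoint of the 30-50 Mb estimate
n_reads = 4_000_000       # hypothetical read count that gives 5x at 50 bp
n_pairs = n_reads // 2    # two mates per pair

print(nucleotide_coverage(n_reads, 50, genome_size))  # 5.0
print(physical_coverage(n_pairs, 2500, genome_size))  # 125.0
```

Under these assumptions the span coverage of the mate pairs comes out far higher than their 5x nucleotide coverage, because each pair bridges about 2500 bp while contributing only 100 bp of sequence.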
I'm not sure any DBG assembler is going to be happy with the inconsistent read lengths of the merged MiSeq library and the low coverage of the mate pairs. I'm not even sure what k-mer size I would pick.
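To illustrate the k-mer problem, here is a small Python sketch using the usual relation between nucleotide coverage C, read length L and k-mer coverage, Ck = C * (L - k + 1) / L. The k values and the 300 bp merged read length are only examples I picked, not recommendations, and the formula assumes all reads in a library have the same length, which is exactly what is not true for the merged reads.

```python
def kmer_coverage(nucleotide_cov: float, read_len: int, k: int) -> float:
    """Expected k-mer coverage for reads of uniform length read_len."""
    if k > read_len:
        return 0.0  # reads shorter than k contribute no k-mers
    return nucleotide_cov * (read_len - k + 1) / read_len


# Mate pairs: 5x nucleotide coverage, 50 bp reads
for k in (21, 31, 41):
    print("mate pairs", k, kmer_coverage(5, 50, k))

# Merged MiSeq reads: taking 300x and roughly 300 bp as example values
for k in (31, 63, 127):
    print("merged", k, kmer_coverage(300, 300, k))
```

With 50 bp reads at 5x, k = 31 leaves only about 2x k-mer coverage, while the merged reads still have plenty of k-mer coverage even at large k. That mismatch is why I doubt a single k would suit both libraries.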
That's why I was thinking of Newbler. I heard it is really good at scaffolding with 454 jump libraries, and I am wondering how it would perform on Illumina jumps. I also think it will like the merged MiSeq reads, since they are much better in quality than 454 reads.
Does anyone have any idea how I could feed it Illumina mate pairs? The tutorial I found (1) shows how reads can be assembled, but does not show how to use Illumina mate pairs or how to specify the insert length. It seems like Newbler just guesses that?
Would be happy to hear any input, Adrian