Hello members, are there any guidelines for choosing the k-mer size for de Bruijn graph-based assembly of second-generation sequencing reads? I have an F. vesca data set with 12,803,137 reads and an average read length of 353 bp. What would be the best k-mer size for assembling this many F. vesca reads? Can I try a k value above 100 in this case? Thanks.
Since you plan to use Velvet: the contrib/ folder contains the VelvetOptimiser script, which runs Velvet over a range of k-mer sizes and selects the best assembly for you.
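To get a feel for what such a sweep trades off, here is a minimal Python sketch (my own illustration, not part of Velvet or the optimiser script). It lists odd k values, since Velvet expects odd k, and estimates the k-mer coverage for each using the relationship given in the Velvet manual, Ck = C * (L - k + 1) / L. The read count and read length come from the question; the genome size of 240 Mb is an assumption for illustration only, and the function names are made up.

```python
def odd_kmers(start, end, step=2):
    """Return odd k values in [start, end]; step must be even to stay odd."""
    if step <= 0 or step % 2 != 0:
        raise ValueError("step must be a positive even number")
    if start % 2 == 0:
        start += 1  # bump an even start up to the next odd value
    return list(range(start, end + 1, step))


def kmer_coverage(n_reads, read_len, genome_size, k):
    """Estimate k-mer coverage: Ck = C * (L - k + 1) / L.

    C is the base coverage, n_reads * read_len / genome_size.
    """
    base_coverage = n_reads * read_len / genome_size
    return base_coverage * (read_len - k + 1) / read_len


if __name__ == "__main__":
    N_READS = 12_803_137        # from the question
    READ_LEN = 353              # average read length from the question
    GENOME_SIZE = 240_000_000   # ASSUMPTION: replace with your own estimate

    for k in odd_kmers(31, 151, step=20):
        ck = kmer_coverage(N_READS, READ_LEN, GENOME_SIZE, k)
        print(f"k={k}\testimated k-mer coverage={ck:.1f}")
```

Larger k lowers the k-mer coverage for the same reads, so the table this prints gives a rough idea of how far up the k range is worth scanning for a given data set.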
Before running Velvet, you need to preprocess your data: remove adapters and low-quality regions, and possibly error-correct the reads.
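As a rough illustration of what the adapter-removal and quality-trimming steps do to a single read, here is a minimal Python sketch. In practice you would use a dedicated trimming tool rather than this. The sketch assumes Phred+33 quality encoding and exact adapter matches; the threshold of 20, the minimum length, and the function names are placeholders, and error correction is not covered.

```python
def trim_adapter(seq, qual, adapter):
    """Cut the read at the first exact occurrence of the adapter, if any."""
    pos = seq.find(adapter)
    if pos == -1:
        return seq, qual
    return seq[:pos], qual[:pos]


def trim_low_quality_tail(seq, qual, min_q=20, offset=33):
    """Drop bases from the 3' end while their Phred score is below min_q."""
    end = len(seq)
    while end > 0 and ord(qual[end - 1]) - offset < min_q:
        end -= 1
    return seq[:end], qual[:end]


def preprocess(seq, qual, adapter, min_q=20, min_len=50):
    """Apply both trimming steps; return None if the read ends up too short."""
    seq, qual = trim_adapter(seq, qual, adapter)
    seq, qual = trim_low_quality_tail(seq, qual, min_q)
    if len(seq) < min_len:
        return None
    return seq, qual
```

A read that comes back as None would simply be left out of the file passed to the assembler.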