What is the best k-mer size to set when assembling a bacterial genome?
13 months ago
zhangdengwei

Hi all,

I am new to bacterial genome assembly. I'd like to assemble a batch of single bacterial genomes using SPAdes, IDBA-UD, or MEGAHIT, but I have no idea which is better or how to choose the best k-mer size for assembly. Any suggestions would be greatly appreciated!

There's no single best option as it depends on the bacterium.

You might be interested in using something like Shovill (a wrapper/front end around SPAdes), which will try several k-mer options and assess the best.
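If you would rather set k-mer sizes by hand, SPAdes accepts a comma-separated list of odd k-mer sizes through its `-k` option and uses them iteratively within one run. Below is a rough sketch of building such a command from the read length. The starting size, step, and file names are illustrative assumptions, not recommended values, and the command is only printed, not executed.

```python
def candidate_kmers(read_length, k_min=21, k_max=127, step=12):
    """Return odd k-mer sizes from k_min up to min(k_max, read_length - 1).

    k_min and step are arbitrary illustrative choices; SPAdes requires
    odd k-mer sizes smaller than 128.
    """
    upper = min(k_max, read_length - 1)
    return [k for k in range(k_min, upper + 1, step) if k % 2 == 1]


def spades_command(reads_1, reads_2, out_dir, kmers):
    """Build (but do not run) a SPAdes command line for paired-end reads."""
    return [
        "spades.py",
        "--isolate",              # mode intended for isolate data
        "-1", reads_1,
        "-2", reads_2,
        "-k", ",".join(str(k) for k in kmers),
        "-o", out_dir,
    ]


if __name__ == "__main__":
    # Hypothetical file names and read length, for illustration only.
    kmers = candidate_kmers(read_length=150)
    print(" ".join(spades_command("sample_R1.fastq.gz", "sample_R2.fastq.gz",
                                  "sample_spades", kmers)))
```

Leaving `-k` out lets SPAdes pick the k-mer sizes itself, which is often a reasonable starting point.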

Assembly is an art, but the defaults usually work well. In any case, I would use an established pipeline such as https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1007134 or https://github.com/nf-core/bacass

I entirely agree with you that assembly is an art. When I tried several assemblers with different parameters, I got different results. For example, I could obtain a higher N50 with Unicycler than with SPAdes, but it came at the cost of a shorter total assembly length. In general, SPAdes seems to outperform the others, at least in my experience.

In my experience too.
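
For anyone comparing assemblers on the N50 and total assembly length mentioned above, here is a minimal sketch of computing both from a contigs FASTA file. The file names in the usage part are hypothetical placeholders. N50 is taken here as the length of the contig at which the cumulative length, counted from the longest contig downward, first reaches half of the total.

```python
def read_contig_lengths(fasta_path):
    """Return the sequence length of every record in a FASTA file."""
    lengths = []
    current = None
    with open(fasta_path) as handle:
        for line in handle:
            line = line.strip()
            if line.startswith(">"):
                # A new header closes the previous record, if any.
                if current is not None:
                    lengths.append(current)
                current = 0
            elif line and current is not None:
                current += len(line)
    if current is not None:
        lengths.append(current)
    return lengths


def assembly_stats(contig_lengths):
    """Return contig count, total length, and N50 for a list of lengths."""
    lengths = sorted(contig_lengths, reverse=True)
    total = sum(lengths)
    running = 0
    for length in lengths:
        running += length
        if running * 2 >= total:
            return {"contigs": len(lengths), "total_length": total, "n50": length}
    # Empty input: nothing to summarise.
    return {"contigs": 0, "total_length": 0, "n50": 0}


if __name__ == "__main__":
    # Hypothetical output files from two assemblers.
    for path in ("spades_contigs.fasta", "unicycler_assembly.fasta"):
        try:
            print(path, assembly_stats(read_contig_lengths(path)))
        except FileNotFoundError:
            print(path, "not found")
```

Looking at N50 together with total length and contig count makes the trade-off visible: an assembly that drops short contigs can show a higher N50 while covering less of the genome.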