Question

Guidelines for choosing the k-mer size for de Bruijn graph-based assembly (2nd generation sequencing reads)?

7.8 years ago

saranpons3 ▴ 70

Hello members, are there any guidelines for choosing the k-mer size for de Bruijn graph-based assembly of 2nd generation sequencing reads? I have an F. vesca data set with 12,803,137 reads and an average read length of 353 bp. What would be the best k-mer size for assembling this many F. vesca reads? Can I try a k value above 100 in this case? Thanks.

k-mer de bruijn graph • 4.0k views

updated 7.8 years ago by h.mon 35k • written 7.8 years ago by saranpons3 ▴ 70

Which assembler are you considering? Has your data been preprocessed somehow?

Reply • 7.8 years ago by h.mon 35k

Dear h.mon, the dataset has not been preprocessed yet. Does the k-mer size depend on the assembler? If so: for now I have Velvet installed on my computer, so I will use the Velvet assembler.

Reply • 7.8 years ago by saranpons3 ▴ 70

The k-mer size is limited by your read length: a k-mer cannot be longer than the reads it is drawn from.

The k-mer size is somewhat independent of the assembler and has more to do with your read lengths, I would imagine. People typically start in the 30-40 range, but with larger k values you can achieve a more comprehensive assembly (at a computational expense), provided coverage is high enough, since each read yields fewer k-mers as k grows.
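To make the read-length constraint and the coverage trade-off concrete, here is a rough Python sketch. It uses the usual relation from the Velvet manual, Ck = C * (L - k + 1) / L, where C is the base coverage and L the read length. The read count and read length are the figures from the question; the genome size is not given anywhere in this thread, so the value below is only a placeholder assumption that you should replace with your own estimate.

```python
def kmers_per_read(read_length, k):
    """Number of k-mers one read contributes: L - k + 1 (0 if k > L)."""
    return max(read_length - k + 1, 0)


def kmer_coverage(base_coverage, read_length, k):
    """k-mer coverage Ck = C * (L - k + 1) / L."""
    return base_coverage * kmers_per_read(read_length, k) / read_length


# Figures taken from the question.
n_reads = 12_803_137
mean_read_length = 353

# ASSUMPTION: genome size is not stated in the thread; placeholder value only.
genome_size = 240_000_000

# Base coverage C = total sequenced bases / genome size.
base_coverage = n_reads * mean_read_length / genome_size

# Print k, k-mers per read, and the resulting k-mer coverage.
for k in (31, 61, 101, 151, 201):
    print(k, kmers_per_read(mean_read_length, k),
          round(kmer_coverage(base_coverage, mean_read_length, k), 1))
```

The point of the table it prints is only the trend: as k approaches the read length, each read contributes fewer k-mers, so the effective coverage available to the graph drops.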

Reply • 7.8 years ago by Kevin Blighe 89k

score 1 · Answer 1 · 2017-09-19

7.8 years ago

h.mon 35k

As you plan to use Velvet: the contrib/ folder contains the VelvetOptimiser script, which runs Velvet over a range of k-mer sizes and selects the best assembly for you.

Before using Velvet, you have to preprocess your data: remove adapters and low-quality regions, and possibly error-correct the reads.
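As a toy illustration of what "removing low-quality regions" means, the Python sketch below cuts low-quality bases off the 3' end of a single read. This is a deliberately simple tail-trimming rule, not the algorithm of any particular trimming tool, and it does not cover adapter removal or error correction. It assumes Phred+33 quality encoding; the function name and the threshold of 20 are illustrative choices, not taken from this thread.

```python
def trim_low_quality_tail(seq, qual, min_q=20, offset=33):
    """Drop trailing bases whose Phred score is below min_q.

    ASSUMPTION: qualities are Phred+33 encoded (offset=33).
    """
    end = len(seq)
    # Walk back from the 3' end while the last kept base is below threshold.
    while end > 0 and ord(qual[end - 1]) - offset < min_q:
        end -= 1
    return seq[:end], qual[:end]


# 'I' encodes Q40, '#' encodes Q2 and '!' encodes Q0 under Phred+33.
trimmed_seq, trimmed_qual = trim_low_quality_tail("ACGTACGT", "IIIII#!#")
print(trimmed_seq, trimmed_qual)
```

For real data a dedicated read trimmer is the better choice; the sketch only shows the idea.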

Answer • 7.8 years ago by h.mon 35k

I was also originally going to mention Velvet and VelvetOptimiser.

VelvetOptimiser lets you test a range of k-mer sizes and then picks the best one based on how you define two command-line parameters: --optFuncKmer (the function used to choose the best k-mer size) and --optFuncCov (the function used to choose the best coverage cutoff).
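A small Python sketch of how such a run could be set up is shown here. It only builds and prints a command line; it does not run anything. The -s, -e and -f options are VelvetOptimiser's start k, end k and velveth file string as far as I recall, so please check them against `VelvetOptimiser.pl --help` for your version. The file name, the `-long` read category and the 'n50' / 'Lbp' values are illustrative assumptions. Velvet expects odd k values, and it must be compiled with a MAXKMERLENGTH at least as large as the biggest k you test (the default build stops at 31).

```python
import shlex


def odd_k_values(start, end):
    """Odd k values in [start, end]; Velvet expects odd hash lengths."""
    if start % 2 == 0:
        start += 1
    return list(range(start, end + 1, 2))


def velvetoptimiser_command(reads, k_start, k_end,
                            opt_kmer="n50", opt_cov="Lbp"):
    """Build (but do not run) a VelvetOptimiser command line.

    ASSUMPTION: option names and values should be verified with
    `VelvetOptimiser.pl --help`; the reads file name is hypothetical.
    """
    ks = odd_k_values(k_start, k_end)
    if not ks:
        raise ValueError("no odd k value in the requested range")
    return [
        "VelvetOptimiser.pl",
        "-s", str(ks[0]),                 # smallest k to test
        "-e", str(ks[-1]),                # largest k to test
        "-f", f"-long -fastq {reads}",    # string handed on to velveth
        "--optFuncKmer", opt_kmer,        # criterion for the best k
        "--optFuncCov", opt_cov,          # criterion for the coverage cutoff
    ]


cmd = velvetoptimiser_command("reads.trimmed.fastq", 31, 101)
print(shlex.join(cmd))
```

Building the command as a list keeps the quoted velveth string in one piece, and the list can be handed to `subprocess.run` once the options have been checked.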

Reply • 7.8 years ago by Kevin Blighe 89k