Question: Jellyfish for transcriptome assembly
SJ Basu wrote (4.2 years ago):


I have 2x150 bp reads from a plant transcriptome and would like to assemble them with the Oases/Velvet pipeline. I need to provide a k-mer length, which I had been estimating with Jellyfish. How do I estimate an appropriate value for the -m option of jellyfish count?

PS: I used -m 21 to estimate the k-mer size for 2x250 bp genomic data from a bacterium and used it for assembly in Velvet. It worked wonders there, but it is not working in this case.



KmerGenie estimates the best k-mer length for genome de novo assembly. Given a set of reads, KmerGenie first computes the k-mer abundance histogram for many values of k. Then, for each value of k, it predicts the number of distinct genomic k-mers in the dataset, and returns the k-mer length which maximizes this number. Experiments show that KmerGenie's choices lead to assemblies that are close to the best possible over all k-mer lengths. KmerGenie predictions can be applied to single-k genome assemblers (e.g. Velvet, SOAPdenovo 2, ABySS, Minia). However, multi-k genome assemblers (e.g. SPAdes, IDBA) generally perform better with default parameters (using multiple k values), rather than the single best k predicted by KmerGenie.

Comment written 4.2 years ago by Medhat
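The selection loop described in the comment above (build a k-mer abundance histogram for each k, estimate how many distinct k-mers are real, keep the k that maximizes that number) can be sketched roughly as follows. This is only a toy illustration and not KmerGenie's actual method: KmerGenie fits a statistical model to each histogram, whereas this sketch swaps in a plain abundance cutoff. The function names and the `min_abundance` parameter are hypothetical, and reverse complements are ignored for brevity.

```python
from collections import Counter


def kmer_histogram(reads, k):
    """Count how often each k-mer occurs across all reads (forward strand only)."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts


def solid_kmers(reads, k, min_abundance=2):
    """Number of distinct k-mers seen at least min_abundance times.

    Assumption: a simple cutoff stands in for KmerGenie's model-based
    estimate of the number of distinct genomic k-mers.
    """
    counts = kmer_histogram(reads, k)
    return sum(1 for c in counts.values() if c >= min_abundance)


def pick_k(reads, ks, min_abundance=2):
    """Return the k from ks that maximizes the solid k-mer count."""
    return max(ks, key=lambda k: solid_kmers(reads, k, min_abundance))


if __name__ == "__main__":
    toy_reads = ["ACGTTGCA", "ACGTTGCA"]
    for k in (3, 5, 7):
        print(k, solid_kmers(toy_reads, k))
    print("chosen k:", pick_k(toy_reads, [3, 5, 7]))
```

With a fixed cutoff like this, highly expressed transcripts and rare ones are treated identically, which hints at why a single genome-style estimate can mislead on transcriptome data.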
Brian Bushnell (Walnut Creek, USA) wrote (4.2 years ago):

For 2x150 bp reads, depending on your coverage, I suggest trying a few values of k between about 60 and 100 and seeing which gives the best assembly. Methods for estimating the best k-mer length for genomes do not work well on transcriptomes because of their highly variable coverage.
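A minimal sketch of such a k sweep, assuming a Velvet build compiled with a MAXKMERLENGTH large enough for these values and paired FASTQ input in two files. The helper names, output directory prefix, and read file names are hypothetical. The script only builds the command lines (a dry run) and provides an N50 helper for comparing the resulting transcript lengths. N50 is just one rough comparison metric, not something the answer above prescribes. Velvet expects odd k values, so the candidates are restricted to odd numbers.

```python
def candidate_ks(lo=60, hi=100, step=10):
    """Odd k values between lo and hi (inclusive), spaced by step."""
    first = lo if lo % 2 == 1 else lo + 1
    return [k for k in range(first, hi + 1, step) if k % 2 == 1]


def build_commands(k, reads_1, reads_2, prefix="oases_k"):
    """Command lines for one Velvet/Oases run at a given k (not executed here).

    Assumption: paired-end FASTQ in two separate files; adjust the
    velveth input flags to match your own data.
    """
    outdir = f"{prefix}{k}"
    return [
        ["velveth", outdir, str(k), "-shortPaired", "-fastq",
         "-separate", reads_1, reads_2],
        ["velvetg", outdir, "-read_trkg", "yes"],
        ["oases", outdir],
    ]


def n50(lengths):
    """Length L such that sequences of length >= L cover half the total."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length
    return 0


if __name__ == "__main__":
    # Hypothetical file names; print the commands instead of running them.
    for k in candidate_ks():
        for cmd in build_commands(k, "reads_1.fq", "reads_2.fq"):
            print(" ".join(cmd))
```

To actually run the sweep, each command list could be passed to `subprocess.run(cmd, check=True)`, and the transcript lengths from each output directory fed to `n50` alongside whatever other quality checks you trust for your transcriptome.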
