Question: How to decide Kmer size for RNA seq analysis using Trinity
4.0 years ago by
raya.girish20
raya.girish20 wrote:

Hello everyone! I am totally new to de novo RNA-seq analysis. When deciding on a k-mer size, which is preferable: a shorter k-mer or a longer one? What are the merits and demerits of each?

thanks

rna-seq kmer trinity • 2.9k views
modified 4.0 years ago by iraun3.7k • written 4.0 years ago by raya.girish20
4.0 years ago by
iraun3.7k
Norway
iraun3.7k wrote:

The "optimal" k-mer size has been discussed widely in several other posts. I recommend doing a quick search to get familiar with the common issues in de novo transcriptome assembly.

As for the k-mer size, a good general choice is between half and two thirds of the read length. Too small a k-mer produces a large number of short contigs, most of them partial-length assemblies, while a longer k-mer produces fewer, longer contigs.
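To make the rule of thumb above concrete, here is a minimal sketch that lists candidate k values for a given read length. The function name `candidate_kmer_sizes` and the choice to keep only odd values (a common convention, since an odd k cannot equal its own reverse complement) are assumptions for illustration, not something Trinity requires.

```python
def candidate_kmer_sizes(read_length, step=2):
    """Return odd k values from about half to two thirds of the read length.

    Hypothetical helper: the half-to-two-thirds range is the rule of thumb
    from this thread, and restricting to odd k is an added assumption.
    """
    low = read_length // 2
    high = (2 * read_length) // 3
    if low % 2 == 0:
        low += 1  # start on an odd value
    if step % 2 != 0:
        raise ValueError("step must be even so that every k stays odd")
    return list(range(low, high + 1, step))


print(candidate_kmer_sizes(100))  # [51, 53, 55, 57, 59, 61, 63, 65]
```

Note that an assembler may cap k well below this range, so the list is only a starting point for trial runs.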

You may need to perform several trial runs with different k-mer lengths and select the best one according to statistics such as N50, total scaffold length, etc.
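For comparing trial runs, N50 and total length can be computed directly from the contig lengths of each assembly. The sketch below assumes the assembly is available as lines of FASTA text; the helper names `fasta_lengths` and `n50` are made up for this example.

```python
def fasta_lengths(lines):
    """Return the sequence length of every record in FASTA-formatted lines."""
    lengths = []
    current = None
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            if current is not None:
                lengths.append(current)
            current = 0
        elif line and current is not None:
            current += len(line)
    if current is not None:
        lengths.append(current)
    return lengths


def n50(lengths):
    """Length L such that contigs of length >= L hold at least half the bases."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length
    return 0  # empty assembly


# Toy input, not real data.
toy = [">c1", "ACGTACGT", ">c2", "ACGT", ">c3", "AC"]
lengths = fasta_lengths(toy)
print(sum(lengths), n50(lengths))  # 14 8
```

For a real assembly, the same functions can be fed an open file handle, for example `fasta_lengths(open(path))`, where `path` is whatever FASTA your run produced.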

written 4.0 years ago by iraun3.7k

Thanks for the help, iraun! I agree, but some papers say that a shorter k-mer results in more ambiguities caused by repeats. How is this possible? I need a basic explanation: what exactly happens to the contigs when I use a shorter or a longer k-mer?

written 4.0 years ago by raya.girish20
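On the question of repeat ambiguities, a toy sketch may help. In a de Bruijn graph each node is a (k-1)-mer. When a repeat is at least k-1 bases long, its copies collapse into the same node, and that node gets more than one way out. The assembler cannot tell which exit belongs to which copy, so contigs tend to break there. Once k-1 is longer than the repeat, each copy is read together with its unique flanking bases and the branch disappears. The sequence `TOY` and the function `branching_nodes` below are invented for illustration; `TOY` contains the 5-base repeat GCGTA twice.

```python
from collections import defaultdict

# Hypothetical toy transcript: "GCGTA" occurs twice with different flanks.
TOY = "ATGGCGTACCTTGAGCGTAAGC"


def branching_nodes(seq, k):
    """Return the (k-1)-mers that have more than one distinct successor."""
    successors = defaultdict(set)
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        # Edge from the k-mer's prefix node to its suffix node.
        successors[kmer[:-1]].add(kmer[1:])
    return sorted(node for node, nxt in successors.items() if len(nxt) > 1)


for k in (3, 4, 5, 6, 7):
    print(k, branching_nodes(TOY, k))
# 3 ['TA', 'TG']
# 4 ['GTA']
# 5 ['CGTA']
# 6 ['GCGTA']
# 7 []
```

The trade-off is that a longer k needs more overlapping reads to connect neighbouring k-mers, so lowly expressed transcripts may end up fragmented or missing.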

As iraun pointed out, a k-mer size of 1/2 to 2/3 of the read length is a good starting point. Why not do a trial run with 3-4 different k-mer sizes and see for yourself? It is important to do what iraun said. You can assemble with Trinity and then use the same k-mer sizes with Salmon/Sailfish to index each assembly and see how the quantification (transcript abundance) changes. I recommend Salmon since it is fast and lightweight, and it can run without alignments, so you can do this assessment pretty quickly.
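The comparison described above could be scripted roughly as follows. This sketch only builds the Salmon command lines and does not run them. The function name, the directory layout and the file names are placeholders. The `salmon index -t/-i/-k` and `salmon quant -i/-l/-1/-2/-o` options are the ones I know from Salmon's command line, but check them against your installed version. As far as I know, Salmon expects an odd k no larger than 31, so larger Trinity k values would need a smaller index k.

```python
import shlex


def salmon_commands(assemblies, reads_1, reads_2, out_root="salmon_runs"):
    """Build Salmon index and quant commands for each assembly.

    assemblies maps a k value to the FASTA path of the assembly built with
    that k (paths and out_root layout are assumptions for this example).
    """
    commands = []
    for k, fasta in sorted(assemblies.items()):
        index_dir = f"{out_root}/index_k{k}"
        quant_dir = f"{out_root}/quant_k{k}"
        commands.append(
            ["salmon", "index", "-t", fasta, "-i", index_dir, "-k", str(k)]
        )
        commands.append(
            ["salmon", "quant", "-i", index_dir, "-l", "A",
             "-1", reads_1, "-2", reads_2, "-o", quant_dir]
        )
    return commands


# Placeholder paths, not real files.
demo = {25: "trinity_k25/Trinity.fasta", 31: "trinity_k31/Trinity.fasta"}
for cmd in salmon_commands(demo, "reads_1.fq", "reads_2.fq"):
    print(shlex.join(cmd))
```

Each list can be passed to `subprocess.run(cmd, check=True)` once the paths point at real data.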

written 4.0 years ago by ivivek_ngs4.9k

okay thanks vchris_ngs

written 4.0 years ago by raya.girish20

I think Trinity sets the k-mer size to 25 when it runs Jellyfish. I don't think you can change this, unless newer versions of the program let you set the parameter.

written 4.0 years ago by st.ph.n2.5k

Now you can, but only up to a maximum of k=32 (the limit is there to save memory, as I read elsewhere).

written 3.1 years ago by Andrés Ribone0