Question: best kmer for genome size estimation
popayekid55 wrote (2.7 years ago):

Hi,

How do I select the best k-mer size for genome size estimation using Jellyfish? I tried k=31 and k=41, and each gave me a different result. The reads are 2×150 bp paired-end. Are there any other tools? I am also trying GCE.

Thanks in advance

kmer genome • 2.8k views
toralmanvar wrote (2.7 years ago):

Hello,

You could try KmerGenie.

It works for me most of the time.


I have tried that, and the best k-mer size reported by KmerGenie was far too high: 119 for this data. I was not sure whether to use this k-mer size in Jellyfish for genome size estimation.

Reply written 2.7 years ago by popayekid55

I don't understand: KmerGenie already reports an estimated genome size along with the best k-mer size, so why do you want to use Jellyfish again for the same thing? Checking the histo.pdf file generated by KmerGenie carefully should give you the answer you need.

Reply written 2.7 years ago by toralmanvar
genomax wrote (2.7 years ago):

You can use Jellyfish as described here. The BBMap suite also has kmercountexact.sh, which can be used for this purpose.
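For readers who want to see what a histogram-based estimate involves, here is a minimal Python sketch of one common approach: skip the low-depth error k-mers, find the main coverage peak, and divide the total number of remaining k-mers by the peak depth. It assumes a two-column text histogram (depth, then number of distinct k-mers at that depth), which is the kind of output `jellyfish histo` writes. The function names and the simple "first local minimum" rule for the error cutoff are illustrative assumptions, not the method of the linked tutorial or of any particular tool.

```python
def parse_histo(text):
    """Parse whitespace-separated 'depth count' lines into (depth, count) pairs."""
    rows = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2:
            rows.append((int(parts[0]), int(parts[1])))
    return sorted(rows)


def estimate_genome_size(rows):
    """Return (valley, peak_depth, estimated_size) from a sorted k-mer histogram.

    Assumption: the first local minimum separates error k-mers from real ones.
    This is a rough heuristic; noisy or multi-peak histograms need more care.
    """
    counts = dict(rows)
    depths = [d for d, _ in rows]

    # Walk up from the lowest depth until the counts start rising again.
    valley = depths[0]
    for prev, cur in zip(depths, depths[1:]):
        if counts[cur] > counts[prev]:
            valley = prev
            break

    # Keep only k-mers at or above the valley and locate the main peak.
    solid = [(d, c) for d, c in rows if d >= valley]
    peak_depth = max(solid, key=lambda dc: dc[1])[0]

    # Total k-mer occurrences divided by the peak depth approximates genome size.
    total_kmers = sum(d * c for d, c in solid)
    return valley, peak_depth, total_kmers / peak_depth
```

To try it, read the histogram file into a string, pass it through `parse_histo`, and hand the result to `estimate_genome_size`. Heterozygous or highly repetitive genomes can produce several peaks, in which case the highest peak is not necessarily the right one and a model-based tool is the safer choice.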


Thank you for the response. I was following the first link for genome size estimation. It says that for eukaryotes a k-mer length of 17-31 should be fine in Jellyfish, and the tutorial chose 25. I still do not understand how to choose the k-mer length for counting. kmercountexact.sh uses a k-mer length of 31 by default.

Is 31 a standard k-mer length for any eukaryotic genome?

Reply written 2.7 years ago by popayekid55
harish wrote (8 months ago):

I tend to use ntCard and then feed the histograms to GenomeScope; both can be run offline.

You just need to modify the files a little: remove the F* lines and change the separator from tab to space.
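The file tweak described above can be sketched in a few lines of Python. This assumes the histogram has summary lines beginning with an uppercase `F` followed by tab-separated depth/count rows, as the post implies; the function name `ntcard_to_genomescope` is made up for illustration, and the exact ntCard layout may differ between versions, so check your file before relying on it.

```python
def ntcard_to_genomescope(hist_text):
    """Drop lines starting with 'F' and turn tab separators into spaces.

    Assumption: all remaining lines are 'depth<TAB>count' rows.
    """
    out_lines = []
    for line in hist_text.splitlines():
        # Skip blank lines and the F* summary lines.
        if not line.strip() or line.startswith("F"):
            continue
        out_lines.append(" ".join(line.split("\t")))
    return "".join(l + "\n" for l in out_lines)
```

Read the ntCard histogram into a string, pass it through the function, and write the result to a new file for GenomeScope.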

There isn't any particular k-mer size that is golden. It depends on your species and the data generated, i.e. repetitiveness, long vs. short reads, etc.
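One way the data matters: for error-free reads of length L, each read contributes L - k + 1 k-mers, so the expected k-mer coverage is roughly the base coverage times (L - k + 1) / L. The sketch below uses that standard relation to show how the histogram peak moves with k for 150 bp reads. The 50x base coverage in the usage note is a hypothetical value chosen for illustration, and sequencing errors lower the effective k-mer coverage further than this simple formula suggests.

```python
def kmer_coverage(base_coverage, read_length, k):
    """Expected k-mer coverage for error-free reads: C * (L - k + 1) / L."""
    if not 1 <= k <= read_length:
        raise ValueError("k must be between 1 and the read length")
    return base_coverage * (read_length - k + 1) / read_length
```

With a hypothetical 50x base coverage and 150 bp reads, `kmer_coverage(50, 150, 31)` gives 40.0, while k=119 drops to under 11x. That is one reason a very large k can make the main peak harder to separate from the error k-mers.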
