Best k-mer size for genome size estimation
3.5 years ago
popayekid55 ▴ 80

Hi,

How do I select the best k-mer size for genome size estimation using Jellyfish? I tried k = 31 and k = 41, and they gave different results. The reads are 2 × 150 bp. Are there any other tools? I am also trying GCE.

genome kmer • 3.6k views
3.5 years ago
toralmanvar ▴ 910

Hello,

You can give KmerGenie a try.

It works for me most of the time.


I have tried that, and the best k-mer size from KmerGenie was way too high: for this data it was 119. I was not sure whether to use this k-mer size in Jellyfish for genome size estimation.


I don't understand: KmerGenie already gives you an estimated genome size along with the best k-mer size, so why do you want to use Jellyfish again for the same thing? Checking the histo.pdf file generated by KmerGenie carefully should give you the answer you need.

3.5 years ago
GenoMax 102k

You can use Jellyfish as described here. The BBMap suite also has kmercountexact.sh, which can be used for this purpose.
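
For reference, a common way to turn a k-mer histogram into a size estimate (not necessarily the exact procedure in the linked tutorial) is to divide the total number of k-mers by the depth of the main coverage peak. Below is a minimal Python sketch, assuming a two-column histogram file (multiplicity, count) such as the one `jellyfish histo` writes. The function names and the `min_depth` error cutoff are illustrative assumptions, not part of any tool.

```python
def read_histogram(path):
    """Read a two-column k-mer histogram: multiplicity and number of distinct k-mers."""
    hist = []
    with open(path) as fh:
        for line in fh:
            fields = line.split()
            if len(fields) != 2:
                continue  # skip blank or malformed lines
            hist.append((int(fields[0]), int(fields[1])))
    return hist


def estimate_genome_size(hist, min_depth=5):
    """Estimate genome size as (total k-mers) / (depth of the main peak).

    min_depth is an assumed cutoff: k-mers seen fewer times than this are
    treated as sequencing errors and ignored. Choose it from your own plot.
    """
    solid = [(m, c) for m, c in hist if m >= min_depth]
    if not solid:
        raise ValueError("no k-mers at or above min_depth")
    # The main peak is the multiplicity with the most distinct k-mers.
    peak_depth = max(solid, key=lambda mc: mc[1])[0]
    # Total k-mer occurrences = sum of multiplicity * count.
    total_kmers = sum(m * c for m, c in solid)
    return total_kmers / peak_depth, peak_depth
```

Running this on histograms built with k = 31 and k = 41 is one way to see how much the estimate moves with k. The cutoff should sit in the valley between the error peak and the main peak for your data, and this simple version does not account for heterozygosity or repeats.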


Thank you for the response. I was following the first link for genome size estimation. It says that for eukaryotes a k-mer length of 17-31 should be fine in Jellyfish, and the tutorial chose 25. I still don't understand how to choose the k-mer length for counting. kmercountexact.sh uses a k-mer length of 31 by default.

Is 31 a standard k-mer length for any eukaryotic genome?

18 months ago
harish ▴ 330

I tend to use ntCard and then feed the histograms to GenomeScope; both can be run offline.

You just need to modify the files a little: remove the F* lines and change the separator from tab to space.

There isn't any particular k-mer size that is golden. It will depend on your species and the data generated, i.e. repetitiveness, long vs. short reads, etc.
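
The file tweak described above could be sketched in Python roughly as follows. It assumes the ntCard histogram starts with summary lines beginning with `F`, followed by tab-separated multiplicity and count lines; the function names and any file paths you pass in are hypothetical.

```python
def ntcard_to_genomescope(lines):
    """Yield space-separated 'multiplicity count' lines from ntCard histogram lines."""
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("F"):
            continue  # drop blank lines and the F* summary lines
        # Change the separator from tab to a single space.
        yield " ".join(line.split("\t"))


def convert_file(src_path, dst_path):
    """Convert an ntCard histogram file into a space-separated histogram file."""
    with open(src_path) as src, open(dst_path, "w") as dst:
        for out_line in ntcard_to_genomescope(src):
            dst.write(out_line + "\n")
```

Calling `convert_file` with your ntCard output path and a new output path should give a file in the two-column layout that GenomeScope expects.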

18 months ago
andorjkiss ▴ 20