Question: Genome size estimation using jellyfish
2.9 years ago by
karthic100 wrote:


I am facing a problem while estimating genome size using Jellyfish. We have Illumina reads for a shrimp and have run k-mer analyses with k-mer sizes from 17 up to 32. All the histograms show two peaks, but when we compare them, the second peak does not change with k-mer size. So we treated the second peak as the homozygous peak, took the peak height as the coverage, and calculated the genome size. But this grossly underestimates the genome size compared with the estimate from flow cytometry.

So now we are unsure whether we should omit the first peak completely. Please suggest an approach or formula that gives a reasonably accurate estimate.

Thank you.


For genome size estimation using k-mers, you have to consider all the distinct k-mers, so in your case you should also include the heterozygous peak. The heterozygous peak makes up a large share because it gives you exactly double the number of distinct k-mers at exactly half the coverage. I think you should add the heterozygous and homozygous peaks and then use the combined coverage in your genome size calculation. For instance, if your homozygous peak is at 100 and your heterozygous peak is at 50, your combined coverage will be 150. This scenario also fits well with diploid species.

modified 2.9 years ago • written 2.9 years ago by JstRoRR60
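
For anyone doing this by hand, one commonly used manual calculation (not necessarily what either poster used) keeps the k-mers from both peaks in the numerator and divides by the depth of the homozygous peak only: genome size ≈ sum(frequency × count) / homozygous peak depth, with the low-frequency error k-mers left out. Below is a minimal Python sketch of that idea. It assumes the two-column "frequency count" text written by `jellyfish histo`; the function names, the `min_freq` error cutoff and the toy histogram are illustrative assumptions, not real shrimp data.

```python
from typing import List, Tuple

# Toy histogram (hypothetical numbers): error k-mers at depth 1,
# a heterozygous peak at depth 50 and a homozygous peak at depth 100.
TOY_HISTO = """\
1 1000
50 200
100 300
"""


def parse_histo(text: str) -> List[Tuple[int, int]]:
    """Parse 'frequency count' lines into (frequency, count) pairs."""
    rows = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 2:
            continue  # skip blank or malformed lines
        rows.append((int(parts[0]), int(parts[1])))
    return rows


def estimate_genome_size(
    rows: List[Tuple[int, int]], homozygous_peak: int, min_freq: int
) -> float:
    """Total k-mers at depth >= min_freq, divided by the homozygous peak depth.

    min_freq is a user-chosen cutoff (an assumption here) that drops
    low-frequency k-mers, which mostly come from sequencing errors.
    """
    if homozygous_peak <= 0:
        raise ValueError("homozygous_peak must be positive")
    total_kmers = sum(freq * count for freq, count in rows if freq >= min_freq)
    return total_kmers / homozygous_peak


if __name__ == "__main__":
    rows = parse_histo(TOY_HISTO)
    print(estimate_genome_size(rows, homozygous_peak=100, min_freq=10))
```

With these toy numbers, counting only the homozygous peak gives 300, counting both peaks gives 400, and dividing the same total by a summed coverage of 150 gives about 267. The peak depth and the error cutoff still have to be read off the real histogram, so treat the result as a rough estimate.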

Thanks for your reply, JstRoRR. If I simply add both coverages and use the sum in the calculation, my genome size estimate will go down even further. I read somewhere that we have to take the mean of both coverages. But what I need is a solid calculation, and I am not sure which approach to follow.

written 2.9 years ago by karthic100

Are you calculating it manually? Have you tried this online tool? It gives you a model-based estimate of the genome unique length.

written 2.9 years ago by JstRoRR60

Hi JstRoRR,

Thanks a lot. I never knew about the online tool; we have been doing it manually so far. We have histograms for k-mer sizes from 17 up to 32. We will run them through the online tool and see what we get.

Thanks again. kk

written 2.9 years ago by karthic100