Heterozygosity in k-mer histogram
3.2 years ago
jm440 ▴ 10


I ran GenomeScope to estimate the level of heterozygosity in my genome. However, the output plot looks quite strange and, most alarmingly, the estimated genome size is far too large (I am expecting a genome of ~5 Mb and getting 120 Mb), so I am not sure whether I can trust the reported heterozygosity value. Has anyone experienced this before, and can you offer any suggestions on what could be happening here? Some more information: I have 200 bp paired-end reads and fairly high coverage.

Link to plot: here

All the code I used to get the plot:

jellyfish count -C -m 21 -s 5000000000 -t 8 R1.fastq -o reads.jf
jellyfish histo -t 8 reads.jf > reads.histo
Rscript genomescope.R reads.histo 21 200 results_out 700

More of the GenomeScope output:

len:120MB uniq:0.43% het:2.97% kcov:13.3 err:0.143% dup:0.39% k:21

Thank you


I have never used GenomeScope or jellyfish directly. You say your expected genome size is ~5 Mbp, but in the jellyfish command you apparently entered 5G (-s 5000000000, nine zeros instead of six). You could also compare your results with KAT (https://kat.readthedocs.io/en/latest/).
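
One way to sanity-check the GenomeScope fit is to estimate the genome size directly from reads.histo. As far as I know, -s in jellyfish count sets the initial hash size rather than an expected genome size, so on its own it should not inflate the estimate. The sketch below is only an illustration under stated assumptions, not GenomeScope's model: it treats everything below the first valley of the histogram as sequencing errors, takes the tallest remaining peak as the k-mer coverage, and divides the total number of remaining k-mers by that coverage. The function names (parse_histo, first_valley, estimate_size, implied_coverage) and the toy histogram are made up for the example; the only input format assumed is the two-column "multiplicity count" output of jellyfish histo.

```python
from typing import List, Tuple

Row = Tuple[int, int]  # (k-mer multiplicity, number of distinct k-mers)


def parse_histo(text: str) -> List[Row]:
    """Parse two-column 'multiplicity count' lines (jellyfish histo style)."""
    rows = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2:
            rows.append((int(parts[0]), int(parts[1])))
    if not rows:
        raise ValueError("empty histogram")
    return rows


def first_valley(rows: List[Row]) -> int:
    """Multiplicity where counts stop falling; used here as an error cutoff.

    Falls back to the first multiplicity if the histogram never rises again.
    """
    for i in range(1, len(rows)):
        if rows[i][1] > rows[i - 1][1]:
            return rows[i - 1][0]
    return rows[0][0]


def estimate_size(rows: List[Row]) -> Tuple[int, float]:
    """Return (peak multiplicity, total k-mers above the cutoff / peak)."""
    cutoff = first_valley(rows)
    solid = [(m, c) for m, c in rows if m >= cutoff]
    peak = max(solid, key=lambda r: r[1])[0]
    total_kmers = sum(m * c for m, c in solid)
    return peak, total_kmers / peak


def implied_coverage(reported_len: float, reported_kcov: float,
                     expected_len: float) -> float:
    """k-mer coverage implied if the same k-mers came from expected_len bases.

    Assumption: total k-mers is roughly reported_len * reported_kcov,
    ignoring any ploidy factor in how the reported values are defined.
    """
    return reported_len * reported_kcov / expected_len


# Toy histogram (made-up numbers): error k-mers at low multiplicity, peak at 5.
TOY = """1 500
2 50
3 10
4 30
5 100
6 30
"""

peak, size = estimate_size(parse_histo(TOY))
print(peak, size)

# For real data (assumed file name from the question):
# rows = parse_histo(open("reads.histo").read())
# print(estimate_size(rows))
```

With the values reported in the question, implied_coverage(120e6, 13.3, 5e6) comes out on the order of a few hundred. So if the genome really is ~5 Mb, the main peak in reads.histo should sit at a multiplicity in that range rather than near 13.3; if it does, the model may have fitted the wrong peak. This is an order-of-magnitude check only. Note also that the jellyfish count command in the question reads only R1.fastq, so the second read file of the pair is not included in the histogram.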

