Question: Large input when using Jellyfish
3.9 years ago by DVA • 540 (United States) wrote:

Does anyone here use Jellyfish on whole genome sequencing data (FASTA converted directly from FASTQ)? The input is ~100 GB and the command looks like this:

/home/jellyfish-2.2.6/bin/jellyfish count -m 14 -s 100M -o /hash/hash_sample_L_0_k_14.jf /sample/sample.fasta

The system returns "Killed" after about 40 min, and I assume this is due to memory or swap exhaustion. I have now lowered the k-mer length to 10, but I would like to know whether there are any alternatives. Thanks a lot.

Update: I tried 10 (-m 10), but it was also "Killed". Trying -m 5 now...

modified 3.9 years ago • written 3.9 years ago by DVA • 540

You could check free memory and swap with htop while the program is running. There are only ~1M possible 10-mers (4^10 = 1,048,576), so an initial hash size of 100M is overkill for that.
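
As a quick check of the numbers in this thread, here is a small sketch that computes how many distinct k-mers are possible over the four-letter DNA alphabet for the k values tried so far. The function name `possible_kmers` is made up for illustration, and this only counts the k-mer space; it does not model how much memory Jellyfish actually allocates.

```python
# Size of the k-mer space over the DNA alphabet {A, C, G, T}.
# Illustration only: this is NOT Jellyfish's memory accounting.

def possible_kmers(k: int) -> int:
    """Return the number of distinct k-mers of length k (4**k)."""
    return 4 ** k

# The k values mentioned in the thread (-m 5, -m 10, -m 14).
for k in (5, 10, 14):
    print(f"k={k}: {possible_kmers(k):,} possible k-mers")
```

For k=10 this gives 1,048,576, which is where the "~1M" figure comes from; for k=14 it is 268,435,456.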

written 3.9 years ago by 5heikki • 9.0k

Thanks so much for the reply. Could you please explain a little further? Does Jellyfish take all reads (I actually have 500M reads) into account at once? I thought the only memory-consuming part was the ~1M possible k-mers. Thanks a lot.

written 3.9 years ago by DVA • 540
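
To make the question above concrete, here is a toy streaming k-mer counter in plain Python. It is only a sketch of the general idea that a hash-based counter can process reads one at a time and keep just the table of distinct k-mers. The names `kmers` and `count_kmers` are hypothetical, and this is not how Jellyfish is implemented internally, nor does it explain why the job above was killed.

```python
from collections import Counter
from typing import Iterable, Iterator

def kmers(seq: str, k: int) -> Iterator[str]:
    """Yield every overlapping k-mer of seq (no handling of N bases)."""
    seq = seq.upper()
    for i in range(len(seq) - k + 1):
        yield seq[i:i + k]

def count_kmers(reads: Iterable[str], k: int) -> Counter:
    """Count k-mers read by read; only the count table is kept."""
    counts: Counter = Counter()
    for read in reads:  # reads may be a generator, e.g. lines of a file
        counts.update(kmers(read, k))
    return counts

# Tiny made-up example: three reads that share the same four 3-mers.
reads = ["ACGTAC", "CGTACG", "ACGTAC"]
counts = count_kmers(reads, 3)
print(len(counts), "distinct 3-mers from", len(reads), "reads")
```

In this toy version the table grows with the number of distinct k-mers seen, not with the number of reads, which matches the intuition in the question. Whether that holds for a real run depends on the tool's actual hash sizing and buffering.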