Question: Large input when using jellyfish
Asked 3.0 years ago by DVA (United States):

Does anyone here use Jellyfish on whole-genome sequencing data (reformatted directly from FASTQ)? The input is ~100 GB and the command is as follows:

/home/jellyfish-2.2.6/bin/jellyfish count -m 14 -s 100M -o /hash/hash_sample_L_0_k_14.jf /sample/sample.fasta

The process is "Killed" after about 40 min, which I assume is due to memory or swap exhaustion. For now I have lowered the k-mer length to 10, but I would like to know whether there are any alternatives. Thanks a lot.

Update: I tried -m 10, but it was also "Killed". Trying -m 5 now...


You could check free memory and swap with htop while the program is running. There are only ~1M possible 10-mers (4^10 = 1,048,576), so an initial hash size of 100M is overkill for that.
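
To put numbers on the k-mer space, here is a minimal sketch in bash. It assumes a plain 4-letter DNA alphabet and counts every k-mer as distinct (no collapsing of reverse complements); `kmer_space` is just a helper name made up for this illustration, not part of Jellyfish.

```shell
#!/usr/bin/env bash
# Number of distinct k-mers over the 4-letter DNA alphabet: 4^k.
# kmer_space is a hypothetical helper, not a Jellyfish command.
kmer_space() {
    local k=$1
    echo $(( 4 ** k ))
}

# Compare the k values tried in this thread.
for k in 5 10 14; do
    printf 'k=%-2d possible k-mers: %d\n' "$k" "$(kmer_space "$k")"
done
```

For k=10 this gives 1,048,576, far below the 100M passed to -s, while for k=14 it gives 268,435,456, which is larger than 100M.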

Reply written 3.0 years ago by 5heikki

Thanks so much for the reply. Could you please explain a little further? Does Jellyfish take all reads (I actually have 500M reads) into consideration at once? I thought the only memory-consuming part was the 1M possible k-mers... Thanks a lot.

Reply written 3.0 years ago by DVA