I ran bbduk using:
/opt/bbmap/bbduk.sh in1=R1.fq.gz in2=R2.fq.gz out1=cleaned_R1.fq.gz out2=cleaned_R2.fq.gz ref=reference.fa k=31 hdist=1 stats=stats.txt overwrite=true
The bbduk.sh wrapper translated this to the following Java call, which then failed:
java -ea -Xmx299m -Xms299m -cp /opt/bbmap/current/ jgi.BBDuk in1=R1.fq.gz in2=R2.fq.gz out1=cleaned_R1.fq.gz out2=cleaned_R2.fq.gz ref=reference.fa k=31 hdist=1 stats=stats.txt overwrite=true
Executing jgi.BBDuk [in1=R1.fq.gz, in2=R2.fq.gz, out1=cleaned_R1.fq.gz, out2=cleaned_R2.fq.gz, ref=reference.fa, k=31, hdist=1, stats=stats.txt, overwrite=true]
Version 38.44
0.011 seconds.
Initial:
Memory: max=301m, total=301m, free=282m, used=19m
java.lang.OutOfMemoryError
    at kmer.HashArray1D.resize(HashArray1D.java:216)
    at kmer.HashArray.setIfNotPresent(HashArray.java:210)
    at jgi.BBDuk$LoadThread.mutate(BBDuk.java:2251)
    at jgi.BBDuk$LoadThread.mutate(BBDuk.java:2272)
    at jgi.BBDuk$LoadThread.addToMap(BBDuk.java:2226)
    at jgi.BBDuk$LoadThread.addToMap(BBDuk.java:2124)
    at jgi.BBDuk$LoadThread.run(BBDuk.java:2033)
This program ran out of memory.
Try increasing the -Xmx flag and using tool-specific memory-related parameters.
As you can see, the -Xmx and -Xms values (299 MB each) are pretty low.
I am running this in a Docker container, and this is the memory I have available:
$> cat /proc/meminfo
MemTotal: 2046748 kB
MemFree: 1261068 kB
MemAvailable: 1536428 kB
Buffers: 186136 kB
Cached: 191120 kB
SwapCached: 800 kB
...
In the BBDuk reference on the webpage, I read the following regarding calculating Xmx and Xms:
BBDuk's shellscript will try to autodetect the available memory and use about half of it. You can override this with with the -Xmx flag, e.g. "bbduk.sh -Xmx1g in=reads.fq". That command will force it to use 1 GB. Most operations such as adapter-trimming and quality-trimming need only a tiny amount of memory. Only processing large references, or using a high value of "hdist" or "edist", actually need a lot of memory. The only factor determining how much memory BBDuk needs is the number of reference kmers stored, which is linearly proportional to the size of the reference. So, if you are not going to be using a reference, or only a small reference, you can add the flag -Xmx1g. If you will using a large reference, modify that flag to be around 85% of the machine's physical memory – for example, -Xmx27g on a 32GB machine. The actual maximum you can use depends on the operating system's configuration.
Specifically, it says "...autodetect the available memory and use about half of it...". According to /proc/meminfo, MemAvailable is about 1.5 GB, so half of that would be roughly 750 MB, well above the 299 MB that bbduk.sh picked.
If possible I would like to take advantage of the "autodetection" feature because then I don't have to hard-code memory values.
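To illustrate what I mean by not hard-coding the value, here is a minimal sketch of a workaround that derives an -Xmx flag from MemAvailable. The function name xmx_from_meminfo and its percentage argument are my own invention, not part of BBTools, and this is not how bbduk.sh does its own detection. It also assumes /proc/meminfo is the right source, which may not hold in a container where a cgroup memory limit is lower than what /proc/meminfo reports.

```shell
#!/bin/sh
# Hypothetical helper (not part of BBTools): print a JVM -Xmx flag sized to
# PERCENT of the MemAvailable value found in a meminfo-style file.
# Usage: xmx_from_meminfo [FILE] [PERCENT]   (defaults: /proc/meminfo, 50)
xmx_from_meminfo() {
    meminfo_file="${1:-/proc/meminfo}"
    percent="${2:-50}"
    awk -v pct="$percent" '
        $1 == "MemAvailable:" {
            # The value is in kB; apply the percentage, then truncate to whole MB.
            printf "-Xmx%dm\n", int($2 * pct / 100 / 1024)
            found = 1
            exit
        }
        # Fail with a non-zero status if no MemAvailable line was seen.
        END { if (!found) exit 1 }
    ' "$meminfo_file"
}

# Example use with the same arguments as my command above:
#   /opt/bbmap/bbduk.sh "$(xmx_from_meminfo /proc/meminfo 50)" in1=R1.fq.gz in2=R2.fq.gz ...
```

With the numbers from my container (MemAvailable: 1536428 kB), a percentage of 50 gives -Xmx750m. I would still prefer the built-in autodetection if it can be made to work.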
Thanks for any suggestions!