Question: How does bbduk calculate its memory requirements?
Lina F (Boston, MA) wrote, 24 days ago:

I ran bbduk using:

/opt/bbmap/bbduk.sh in1=R1.fq.gz in2=R2.fq.gz out1=cleaned_R1.fq.gz out2=cleaned_R2.fq.gz ref=reference.fa k=31 hdist=1 stats=stats.txt overwrite=true

Which bbduk translated to:

java -ea -Xmx299m -Xms299m -cp /opt/bbmap/current/ jgi.BBDuk in1=R1.fq.gz in2=R2.fq.gz out1=cleaned_R1.fq.gz out2=cleaned_R2.fq.gz ref=reference.fa k=31 hdist=1 stats=stats.txt overwrite=true

Executing jgi.BBDuk [in1=R1.fq.gz, in2=R2.fq.gz, out1=cleaned_R1.fq.gz, out2=cleaned_R2.fq.gz, ref=reference.fa, k=31, hdist=1, stats=stats.txt, overwrite=true]
Version 38.44

0.011 seconds.
Initial:
Memory: max=301m, total=301m, free=282m, used=19m

java.lang.OutOfMemoryError
    at kmer.HashArray1D.resize(HashArray1D.java:216)
    at kmer.HashArray.setIfNotPresent(HashArray.java:210)
    at jgi.BBDuk$LoadThread.mutate(BBDuk.java:2251)
    at jgi.BBDuk$LoadThread.mutate(BBDuk.java:2272)
    at jgi.BBDuk$LoadThread.addToMap(BBDuk.java:2226)
    at jgi.BBDuk$LoadThread.addToMap(BBDuk.java:2124)
    at jgi.BBDuk$LoadThread.run(BBDuk.java:2033)

This program ran out of memory.
Try increasing the -Xmx flag and using tool-specific memory-related parameters.

As you can see, the Xmx and the Xms values are pretty low.

I am running this in a docker container and this is the memory I have available:

$> cat /proc/meminfo
MemTotal:        2046748 kB
MemFree:         1261068 kB
MemAvailable:    1536428 kB
Buffers:          186136 kB
Cached:           191120 kB
SwapCached:          800 kB
...

In the BBDuk reference on the webpage, I read the following regarding calculating Xmx and Xms:

BBDuk's shellscript will try to autodetect the available memory and use about half of it. You can override this with the -Xmx flag, e.g. "bbduk.sh -Xmx1g in=reads.fq". That command will force it to use 1 GB. Most operations such as adapter-trimming and quality-trimming need only a tiny amount of memory. Only processing large references, or using a high value of "hdist" or "edist", actually need a lot of memory. The only factor determining how much memory BBDuk needs is the number of reference kmers stored, which is linearly proportional to the size of the reference. So, if you are not going to be using a reference, or only a small reference, you can add the flag -Xmx1g. If you will be using a large reference, modify that flag to be around 85% of the machine's physical memory – for example, -Xmx27g on a 32GB machine. The actual maximum you can use depends on the operating system's configuration.

Specifically, it says "...autodetect the available memory and use about half of it...". Based on /proc/meminfo, I have much more memory available than roughly 2 x 300 MB.

Any ideas?

If possible I would like to take advantage of the "autodetection" feature because then I don't have to hard-code memory values.

Thanks for any suggestions!


Lina F: bbduk.sh's memory needs are very low. A couple of gigabytes (-Xmx2g) is normally sufficient if you are just scanning for the presence of primers/adapters.

The autodetect feature is supposed to work on standalone servers (it may not apply to VMs). What is the size of your reference.fa? Depending on that, we can adjust the memory specification.

Written 24 days ago by genomax (modified 24 days ago)

It's actually pretty small:

$> ls -lah reference.fa
-rwxrwxrwx 1 1002 1002 115K Oct 14  2016 reference.fa

I thought I could avoid setting Xmx and Xms manually. Maybe that is not the case.

Thanks for any suggestions you might have!

Written 24 days ago by Lina F

Based on the info above, you have assigned only 2 GB of RAM to this VM. Is it possible to use more? If not, I would try setting -Xmx1g and see if that works. Your reference is small, but the memory needs depend on how many unique k-mers it generates.

Written 23 days ago by genomax
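
To put a rough number on "how many unique k-mers it generates": with hdist=1, each reference k-mer is stored together with all of its single-substitution variants, which is 1 + 3*k entries per k-mer (94 for k=31). The sketch below is only a back-of-the-envelope upper bound. It assumes the 115K file size is close to the number of reference bases and ignores duplicate k-mers; the function names are made up for illustration and are not part of BBTools.

```shell
#!/usr/bin/env bash
# Rough upper bound on the number of k-mers stored for a reference.
# Assumption: at hdist=1 every k-mer expands to itself plus each
# single-base substitution, i.e. 1 + 3*k entries. Duplicates are ignored,
# so the real count can be lower.

# kmer_variants K HDIST -> entries per reference k-mer (hdist 0 or 1 only)
kmer_variants() {
    local k=$1 hdist=$2
    if [ "$hdist" -eq 0 ]; then
        echo 1
    else
        echo $(( 1 + 3 * k ))
    fi
}

# estimate_kmers REF_BASES K HDIST -> approximate number of stored k-mers
estimate_kmers() {
    local bases=$1 k=$2 hdist=$3
    echo $(( bases * $(kmer_variants "$k" "$hdist") ))
}

# ~115,000 bases (approximated from the file size), k=31, hdist=1
estimate_kmers 115000 31 1
```

Under those assumptions the estimate is on the order of ten million entries, so even a small reference can plausibly outgrow a heap of about 300 MB once hdist=1 is set.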

I increased the memory for the Docker container to 8 GB and it worked! This time bbduk used -Xmx1267m -Xms1267m.

It seems like I underestimated how much memory the kmers take up and maybe the autodetect feature really does work differently in a docker container.

Thanks for helping me debug this!!

Written 23 days ago by Lina F
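
For anyone who wants to avoid hard-coding -Xmx inside a container, one option is to compute the value in a wrapper and pass it to bbduk.sh explicitly. The sketch below is not how bbduk.sh itself detects memory. It assumes the container's limit is readable from the usual cgroup files (cgroup v2: /sys/fs/cgroup/memory.max, cgroup v1: /sys/fs/cgroup/memory/memory.limit_in_bytes) and falls back to MemTotal from /proc/meminfo, which inside a container generally reflects the host or VM rather than the container's own limit. The helper names and the default 85% share are illustrative choices.

```shell
#!/usr/bin/env bash
# Sketch only: derive an -Xmx value from the memory limit of a container.
# Assumptions: the cgroup paths below exist on this system, and 85% of the
# limit is an acceptable heap size (lower it if other processes need room).

# xmx_from_bytes BYTES [PERCENT] -> heap size in whole MB, rounded down
xmx_from_bytes() {
    local bytes=$1 percent=${2:-85}
    # Divide before multiplying to stay well inside 64-bit arithmetic
    echo $(( bytes / 100 * percent / 1024 / 1024 ))
}

# container_limit_bytes -> smallest of MemTotal and any cgroup memory limit
container_limit_bytes() {
    local total_kb limit f v
    total_kb=$(awk '/^MemTotal:/ { print $2 }' /proc/meminfo)
    limit=$(( total_kb * 1024 ))
    for f in /sys/fs/cgroup/memory.max \
             /sys/fs/cgroup/memory/memory.limit_in_bytes; do
        [ -r "$f" ] || continue
        v=$(cat "$f")
        # cgroup v2 reports "max" when unlimited; v1 reports a huge number
        case $v in
            ''|*[!0-9]*) ;;   # not a plain number: ignore
            *) if [ "$v" -lt "$limit" ]; then limit=$v; fi ;;
        esac
    done
    echo "$limit"
}

# Example use:
#   bbduk.sh "-Xmx$(xmx_from_bytes "$(container_limit_bytes)")m" in=reads.fq
```

With an 8 GB limit this yields -Xmx6963m. Passing the flag explicitly overrides whatever the shellscript would otherwise autodetect.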