Question: toil stats under-reporting memory usage?
4 weeks ago, ionox0 wrote:

The output from toil stats <jobstore> seems to be reporting values of about 5000K (5 MB), but I believe it should be closer to 5 GB:

 file:///home/johnsoni/pipeline_0.0.39/ACCESS-Pipeline/cwl_tools/bwa-mem/bwa-mem.cwl
                               Memory
min     med     ave     max    total
5659K   5950K   5854K   6128K  128807K

Is there something peculiar about this report? Here is how I'm specifying the resource requirements:

requirements:
  - class: InlineJavascriptRequirement
  - class: ResourceRequirement
    ramMin: 30000
    coresMin: 4
    outdirMax: 20000

I would expect it to be using at least 1 GB on reasonably sized FASTQ files.

Tags: cwl, toil

For bwa mem, memory usage would depend mainly on genome index size, not fastq size. Is your reference genome very small?

Reply written 4 weeks ago by h.mon

It's actually the full-sized hg19 reference. Would you happen to be able to confirm the units that this command reports?

Reply written 4 weeks ago by ionox0

I don't use CWL or Toil, so I really don't know what they report. But bwa mem with the human genome should use about 8 GB of memory, so the numbers seem off.

Reply written 4 weeks ago by h.mon
28 days ago, ionox0 wrote:

I haven't tracked this down fully, but it seems like this line:

tag_str += reportMemory(t, options, field=width, isBytes=True)

means that reportMemory will divide by 1024:

if isBytes:
    k /= 1024.

If the memory values come from resource.getrusage, then I believe the memory should already be in KB (at least on Linux):

me = resource.getrusage(resource.RUSAGE_SELF)
childs = resource.getrusage(resource.RUSAGE_CHILDREN)

See https://stackoverflow.com/questions/12050913/whats-the-unit-of-ru-maxrss-on-linux
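To make the unit question concrete, here is a minimal sketch of reading the peak resident set size and normalising it to bytes. The helper name peak_rss_bytes is hypothetical and is not part of Toil. It assumes the commonly documented platform behaviour: ru_maxrss is in kilobytes on Linux and in bytes on macOS.

```python
import resource
import sys


def peak_rss_bytes(who: int = resource.RUSAGE_SELF) -> int:
    """Return the peak resident set size in bytes (hypothetical helper)."""
    peak = resource.getrusage(who).ru_maxrss
    if sys.platform == "darwin":
        # Assumption: macOS reports ru_maxrss in bytes.
        return peak
    # Assumption: Linux (and most other Unix systems) report kilobytes.
    return peak * 1024


if __name__ == "__main__":
    print("self:    ", peak_rss_bytes(resource.RUSAGE_SELF), "bytes")
    print("children:", peak_rss_bytes(resource.RUSAGE_CHILDREN), "bytes")
```

If the raw ru_maxrss value is passed on as though it were bytes, any later bytes-to-KB conversion will be off by a factor of 1024 on Linux.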

Therefore dividing by 1024 on this line would result in MB, not KB:

https://github.com/DataBiosphere/toil/blob/master/src/toil/utils/toilStats.py#L211
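If that reading is right, the numbers in the original report can be reinterpreted with simple arithmetic. The sketch below is not Toil's code; displayed_to_actual_gib is a hypothetical helper that assumes the raw value was in KB and was divided by 1024 once before being labelled "K", so the displayed figure is really in MB.

```python
def displayed_to_actual_gib(displayed_k: float) -> float:
    """Convert a value shown as 'K' by the report into GiB.

    Assumption: raw KB was divided by 1024 before display, so the
    displayed number is actually MiB; one more division gives GiB.
    """
    return displayed_k / 1024


if __name__ == "__main__":
    # Values copied from the Memory row in the question.
    report = {"min": 5659, "med": 5950, "ave": 5854, "max": 6128, "total": 128807}
    for name, shown in report.items():
        print(f"{name}: shown as {shown}K, likely {displayed_to_actual_gib(shown):.2f} GiB")
```

Under this assumption the median of 5950K would correspond to a little under 6 GiB per job, which is much closer to what bwa mem is expected to need for a human reference.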

Written 28 days ago by ionox0

https://github.com/DataBiosphere/toil/pull/2425

Reply written 28 days ago by ionox0