Question: Picard: EstimateLibraryComplexity -> OutOfMemoryError
David Langenberger wrote, 4.8 years ago:

I want to run EstimateLibraryComplexity.jar on a 9.8 GB BAM file, but I always get an OutOfMemoryError. I have already tried -Xmx (up to 60 GB) and still get the error. Does anybody have an idea how to run EstimateLibraryComplexity on bigger BAM files?

 

Here is my call and the error message:

$ java -Xmx10g -jar EstimateLibraryComplexity.jar INPUT=file.bam OUTPUT=file.libraryComplexity

[Wed Jun 04 21:43:08 CEST 2014] picard.sam.EstimateLibraryComplexity INPUT=[file.bam] OUTPUT=file.libraryComplexity    MIN_IDENTICAL_BASES=5 MAX_DIFF_RATE=0.03 MIN_MEAN_QUALITY=20 MAX_GROUP_RATIO=500 READ_NAME_REGEX=[a-zA-Z0-9]+:[0-9]:([0-9]+):([0-9]+):([0-9]+).* OPTICAL_DUPLICATE_PIXEL_DISTANCE=100 VERBOSITY=INFO QUIET=false VALIDATION_STRINGENCY=STRICT COMPRESSION_LEVEL=5 MAX_RECORDS_IN_RAM=500000 CREATE_INDEX=false CREATE_MD5_FILE=false
[Wed Jun 04 21:43:08 CEST 2014] Executing as me@work on Linux 3.6.2-1.fc16.x86_64 amd64; Java HotSpot(TM) 64-Bit Server VM 1.7.0_07-b10; Picard version: 1.114(444810c1de1433d9eca8130be63ccc7fd70a9499_1400593393) JdkDeflater
INFO    2014-06-04 21:43:08     EstimateLibraryComplexity       Will store 15494157 read pairs in memory before sorting.
INFO    2014-06-04 21:43:13     EstimateLibraryComplexity       Read     1,000,000 records.  Elapsed time: 00:00:05s.  Time for last 1,000,000:    5s.  Last read position: chr10:38,239,480

....

INFO    2014-06-04 21:53:21     EstimateLibraryComplexity       Read    30,000,000 records.  Elapsed time: 00:10:13s.  Time for last 1,000,000:  183s.  Last read position: chr15:34,522,127

[Wed Jun 04 22:54:26 CEST 2014] picard.sam.EstimateLibraryComplexity done. Elapsed time: 71.30 minutes.
Runtime.totalMemory()=5801312256
To get help, see http://picard.sourceforge.net/index.shtml#GettingHelp
Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
        at java.util.Arrays.copyOfRange(Arrays.java:2694)
        at java.lang.String.<init>(String.java:203)
        at java.lang.String.substring(String.java:1913)
        at htsjdk.samtools.util.StringUtil.split(StringUtil.java:89)
        at picard.sam.AbstractDuplicateFindingAlgorithm.addLocationInformation(AbstractDuplicateFindingAlgorithm.java:71)
        at picard.sam.EstimateLibraryComplexity.doWork(EstimateLibraryComplexity.java:256)
        at picard.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:183)
        at picard.cmdline.CommandLineProgram.instanceMainWithExit(CommandLineProgram.java:124)
        at picard.sam.EstimateLibraryComplexity.main(EstimateLibraryComplexity.java:217)

 

And this is the Java version:

$ java -showversion
java version "1.7.0_07"
Java(TM) SE Runtime Environment (build 1.7.0_07-b10)
Java HotSpot(TM) 64-Bit Server VM (build 23.3-b01, mixed mode)

 

EDIT: I also posted this question at SEQanswers!

Tags: bam, picard, java

This smacks of a bug in the program, especially since it happened after over an hour of runtime. What version of Picard tools are you running?

Reply by Dan D, 4.8 years ago

Picard version: 1.114

Reply by David Langenberger, 4.8 years ago

Thanks! Now I see it was waaaay over to the right in your original post!  >.<

Reply by Dan D, 4.8 years ago

Out of curiosity, did it actually max out the space allocated when you used -Xmx60g?
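
One way to check the next time it runs is the JDK's jstat tool, which prints heap capacities and usage for a live JVM; a sketch, assuming pgrep matches only the one Picard process:

$ jstat -gc $(pgrep -f EstimateLibraryComplexity) 5000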

Reply by Devon Ryan, 4.8 years ago

I don't know anymore. But when I check the memory used for the run above, it looks like it only used ~5 GB (Runtime.totalMemory()=5801312256), doesn't it?

Reply by David Langenberger, 4.8 years ago

Indeed, this sounds like a bug. You might post a message to the samtools-help mailing list and see if one of the authors has run into this (if not, it looks like there's a bug report to be filed).

Reply by Devon Ryan, 4.8 years ago

Here's another possibility: your tmp location is being filled up by the operation, so the error is actually triggered when the disk backing /tmp runs out of space. Do you mind checking the location of your /tmp/ folder and the amount of free space on its host volume?

In the past I've resolved this by symlinking /tmp to a folder on a large volume.
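
Instead of the symlink, you could also point both the JVM and Picard directly at a roomier volume; a sketch, where /big/volume/tmp is just a placeholder path and TMP_DIR is Picard's standard temp-directory option:

$ java -Xmx10g -Djava.io.tmpdir=/big/volume/tmp -jar EstimateLibraryComplexity.jar \
    INPUT=file.bam OUTPUT=file.libraryComplexity TMP_DIR=/big/volume/tmp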

Reply by Dan D, 4.8 years ago

I tracked the free space on the volume and the size of the /tmp/ folder, and both are far from full. But thanks for the idea... it was worth a try.

Reply by David Langenberger, 4.8 years ago

Have you tried raising the MIN_IDENTICAL_BASES parameter to something like 10 or even 15? With a BAM file that size, it actually makes sense that you would run out of memory during the sort step.
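
For example, a sketch reusing the call from your log (MIN_IDENTICAL_BASES and MAX_RECORDS_IN_RAM are both standard Picard options; lowering the latter makes the tool spill to disk sooner, trading speed for memory):

$ java -Xmx10g -jar EstimateLibraryComplexity.jar INPUT=file.bam OUTPUT=file.libraryComplexity \
    MIN_IDENTICAL_BASES=10 MAX_RECORDS_IN_RAM=250000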

Reply by Dan D, 4.8 years ago

Hi David,

I know it's been a long time since you posted this thread.

I was curious to know how the error was resolved. Could you please update the thread?

Thanks

Reply by deepue, 4.0 years ago

Sorry, but there is no update. I just stopped using PicardTools.

Reply by David Langenberger, 4.0 years ago