Picard: EstimateLibraryComplexity -> OutOfMemoryError
8.3 years ago

I want to run EstimateLibraryComplexity.jar on a 9.8 GB BAM file, but I always get an OutOfMemoryError. I already tried raising -Xmx (up to 60 GB) and still get the error. Does anybody have an idea how to run EstimateLibraryComplexity on bigger BAM files?

That's my call and the error message:

$ java -Xmx10g -jar EstimateLibraryComplexity.jar INPUT=file.bam OUTPUT=file.libraryComplexity
[Wed Jun 04 21:43:08 CEST 2014] picard.sam.EstimateLibraryComplexity INPUT=[file.bam] OUTPUT=file.libraryComplexity MIN_IDENTICAL_BASES=5 MAX_DIFF_RATE=0.03 MIN_MEAN_QUALITY=20 MAX_GROUP_RATIO=500 READ_NAME_REGEX=[a-zA-Z0-9]+:[0-9]:([0-9]+):([0-9]+):([0-9]+).* OPTICAL_DUPLICATE_PIXEL_DISTANCE=100 VERBOSITY=INFO QUIET=false VALIDATION_STRINGENCY=STRICT COMPRESSION_LEVEL=5 MAX_RECORDS_IN_RAM=500000 CREATE_INDEX=false CREATE_MD5_FILE=false
[Wed Jun 04 21:43:08 CEST 2014] Executing as me@work on Linux 3.6.2-1.fc16.x86_64 amd64; Java HotSpot(TM) 64-Bit Server VM 1.7.0_07-b10; Picard version: 1.114(444810c1de1433d9eca8130be63ccc7fd70a9499_1400593393) JdkDeflater
INFO 2014-06-04 21:43:08 EstimateLibraryComplexity Will store 15494157 read pairs in memory before sorting.
INFO 2014-06-04 21:43:13 EstimateLibraryComplexity Read 1,000,000 records. Elapsed time: 00:00:05s. Time for last 1,000,000: 5s. Last read position: chr10:38,239,480
....
INFO 2014-06-04 21:53:21 EstimateLibraryComplexity Read 30,000,000 records. Elapsed time: 00:10:13s. Time for last 1,000,000: 183s. Last read position: chr15:34,522,127
[Wed Jun 04 22:54:26 CEST 2014] picard.sam.EstimateLibraryComplexity done. Elapsed time: 71.30 minutes.
Runtime.totalMemory()=5801312256
To get help, see http://picard.sourceforge.net/index.shtml#GettingHelp
Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
    at java.util.Arrays.copyOfRange(Arrays.java:2694)
    at java.lang.String.<init>(String.java:203)
    at java.lang.String.substring(String.java:1913)
    at htsjdk.samtools.util.StringUtil.split(StringUtil.java:89)
    at picard.sam.AbstractDuplicateFindingAlgorithm.addLocationInformation(AbstractDuplicateFindingAlgorithm.java:71)
    at picard.sam.EstimateLibraryComplexity.doWork(EstimateLibraryComplexity.java:256)
    at picard.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:183)
    at picard.cmdline.CommandLineProgram.instanceMainWithExit(CommandLineProgram.java:124)
    at picard.sam.EstimateLibraryComplexity.main(EstimateLibraryComplexity.java:217)

And that's the java version:

$ java -showversion
java version "1.7.0_07"
Java(TM) SE Runtime Environment (build 1.7.0_07-b10)
Java HotSpot(TM) 64-Bit Server VM (build 23.3-b01, mixed mode)


EDIT: I also posted this question at SEQanswers!

java bam picard • 4.5k views

This smacks of a bug in the program, especially since it happened after more than an hour of runtime. Which version of Picard tools are you running?

Picard version: 1.114

Thanks! Now I see it was waaaay over to the right in your original post! >.<

Out of curiosity, did it actually max out the space allocated when you used -Xmx60g?

I don't know any more. But when I check the used memory for the run above, it looks like it only used ~5GB (Runtime.totalMemory()=5801312256), doesn't it?
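
A note on reading that number: Runtime.totalMemory() is the heap the JVM has reserved at that moment, while the ceiling set by -Xmx is what Runtime.maxMemory() returns. The sketch below prints both, which is one way to check whether an -Xmx value really reached the JVM. The class name HeapReport is made up for illustration and is not part of Picard.

```java
import java.util.Locale;

// Hypothetical helper, not part of Picard: prints the heap figures of the running JVM.
final class HeapReport {
    /** Converts a byte count to gibibytes. */
    static double toGiB(long bytes) {
        return bytes / (1024.0 * 1024.0 * 1024.0);
    }

    /** One-line summary: -Xmx ceiling, currently reserved heap, free part of the reserved heap. */
    static String summary() {
        Runtime rt = Runtime.getRuntime();
        return String.format(Locale.ROOT,
                "max=%.2f GiB total=%.2f GiB free=%.2f GiB",
                toGiB(rt.maxMemory()), toGiB(rt.totalMemory()), toGiB(rt.freeMemory()));
    }

    public static void main(String[] args) {
        System.out.println(summary());
    }
}
```

Compile it with javac and start it with the same -Xmx value as the Picard run; if "max" comes out far below what was requested, the flag is probably not being applied. For scale, the 5801312256 bytes in the log above are roughly 5.4 GiB.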

Indeed, this sounds like a bug. You might post a message to the samtools-help email list and see if one of the authors has run into this (if not, it looks like there's a bug report to be filed).

Here's another possibility: your tmp location is being filled up by the operation, so the error is actually triggered when you run out of swap disk. Do you mind checking the location of your /tmp/ folder, and the amount of free space on its host volume?

In the past I've resolved this by symlinking /tmp to a folder on a large volume.
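
A quick way to run that check from Java itself is sketched below. It assumes Picard's scratch files end up in the JVM's default temporary directory (the java.io.tmpdir property), which has not been verified for version 1.114; the class name TmpSpace is invented for this example.

```java
import java.io.File;

// Hypothetical helper, not part of Picard: reports free space where the JVM puts temp files.
final class TmpSpace {
    /** Directory the JVM uses for temporary files (system property java.io.tmpdir). */
    static File tmpDir() {
        return new File(System.getProperty("java.io.tmpdir"));
    }

    /** Bytes available to this JVM on the volume that holds the given directory. */
    static long usableBytes(File dir) {
        return dir.getUsableSpace();
    }

    public static void main(String[] args) {
        // An optional first argument overrides the directory to inspect.
        File dir = args.length > 0 ? new File(args[0]) : tmpDir();
        double gib = usableBytes(dir) / (1024.0 * 1024.0 * 1024.0);
        System.out.printf("%s: %.1f GiB usable%n", dir.getAbsolutePath(), gib);
    }
}
```

Running it under the same user account as the Picard job matters, since quotas and the temp location can differ between users.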

I tracked the free space of the volume and the size of the /tmp/ folder, and both are far from full. But thanks for the idea... it was worth a try.

Have you tried raising the MIN_IDENTICAL_BASES parameter to something like 10 or even 15? With a BAM file that size, it actually makes sense that you would run out of memory during the sort step.

Hi David,

I know it's been a long time since you posted this thread.

I was curious whether the error was ever resolved?

Thanks

Sorry, but there is no update. I just stopped using PicardTools.

11 weeks ago
Sumaya • 0

Try creating a temporary directory for Picard's working files and pass its path with the --TMP_DIR option. I hope this helps, even this long after the thread was posted!
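
As a rough sketch of that suggestion: the snippet below creates a scratch directory and prints the command from the original post with a temp-directory argument appended. It uses the KEY=VALUE spelling (TMP_DIR=...) to match the Picard 1.114 call in this thread, on the assumption that this is the older form of --TMP_DIR. The directory name picard_tmp and the class name PicardTmpDir are placeholders, not anything from Picard.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.List;

// Hypothetical wrapper, not part of Picard: prepares a scratch directory and
// assembles the command line from the original post with TMP_DIR added.
final class PicardTmpDir {
    /** Creates the scratch directory (and any missing parents) and returns its absolute path. */
    static Path prepare(String dir) throws IOException {
        return Files.createDirectories(Paths.get(dir)).toAbsolutePath();
    }

    /** Command in the KEY=VALUE style shown in the thread; TMP_DIR=... is the assumed legacy form. */
    static List<String> command(Path tmp) {
        return Arrays.asList(
                "java", "-Xmx10g", "-jar", "EstimateLibraryComplexity.jar",
                "INPUT=file.bam", "OUTPUT=file.libraryComplexity",
                "TMP_DIR=" + tmp);
    }

    public static void main(String[] args) throws IOException {
        // "picard_tmp" is an assumed location; point it at a volume with plenty of space.
        Path tmp = prepare(args.length > 0 ? args[0] : "picard_tmp");
        System.out.println(String.join(" ", command(tmp)));
    }
}
```

The program only prints the command so it can be inspected before anything is run. Whether a different temp directory helps with a "Java heap space" error specifically is not something this thread settles.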