GATK CollectAllelicCounts error
11 months ago
Peter Chung ▴ 140

I am running GATK's CollectAllelicCounts tool, but I get an error that I don't know how to fix. The interval list was downloaded from the GATK resource bundle (https://storage.cloud.google.com/genomics-public-data/resources/broad/hg38/v0/wgs_calling_regions.hg38.interval_list) and then processed with GATK's PreprocessIntervals tool.

My script is below:

# ${NAME} and ${TMPFILE} are defined outside this snippet
WD="/home/Desktop/CNV"
REF="${WD}/ref/hg38.fasta"
INT="${WD}/ref/wgs.hg38.interval_list"
DICT="${WD}/ref/hg38.fasta.dict"

# No whitespace may follow a trailing backslash, or the line continuation breaks
time gatk --java-options "-Xmx16g -Djava.io.tmpdir=${TMPFILE}" CollectAllelicCounts \
    --intervals "${INT}" \
    --input "${NAME}.addRG.mkdup.recal.bam" \
    --reference "${REF}" \
    --tmp-dir "${TMPFILE}" \
    --sequence-dictionary "${DICT}" \
    --output "${NAME}.allelic_counts.tsv"

And the error is below:

Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
    at java.util.Arrays.copyOf(Arrays.java:3181)
    at java.util.ArrayList.grow(ArrayList.java:265)
    at java.util.ArrayList.ensureExplicitCapacity(ArrayList.java:239)
    at java.util.ArrayList.ensureCapacityInternal(ArrayList.java:231)
    at java.util.ArrayList.add(ArrayList.java:462)
    at org.broadinstitute.hellbender.tools.copynumber.datacollection.AllelicCountCollector.collectAtLocus(AllelicCountCollector.java:72)
    at org.broadinstitute.hellbender.tools.copynumber.CollectAllelicCounts.apply(CollectAllelicCounts.java:152)
    at org.broadinstitute.hellbender.engine.LocusWalker.lambda$traverse$0(LocusWalker.java:176)
    at org.broadinstitute.hellbender.engine.LocusWalker$$Lambda$91/1519482659.accept(Unknown Source)
    at java.util.Iterator.forEachRemaining(Iterator.java:116)
    at org.broadinstitute.hellbender.engine.LocusWalker.traverse(LocusWalker.java:174)
    at org.broadinstitute.hellbender.engine.GATKTool.doWork(GATKTool.java:966)
    at org.broadinstitute.hellbender.cmdline.CommandLineProgram.runTool(CommandLineProgram.java:139)
    at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMainPostParseArgs(CommandLineProgram.java:192)
    at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:211)
    at org.broadinstitute.hellbender.Main.runCommandLineProgram(Main.java:160)
    at org.broadinstitute.hellbender.Main.mainEntry(Main.java:203)
    at org.broadinstitute.hellbender.Main.main(Main.java:289)

I tried increasing the memory, but that didn't work either.
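The trace fails inside AllelicCountCollector.collectAtLocus while an ArrayList is growing, which may mean the collector keeps one entry per locus in memory. That reading is an assumption and is not confirmed anywhere in this thread. If it holds, the number of bases covered by the interval list is worth checking before raising -Xmx further. A minimal sketch, assuming a Picard-style interval list (header lines starting with "@", then tab-separated contig, start, end, strand, name with 1-based inclusive coordinates); the function name count_interval_bases is hypothetical:

```shell
# Sum the bases covered by a Picard-style interval list.
# Header lines ("@...") are skipped; each data line contributes end - start + 1.
count_interval_bases() {
    awk -F'\t' '!/^@/ && NF >= 3 { total += $3 - $2 + 1 }
                END { printf "%.0f\n", total }' "$1"
}

# Example call (path taken from the script above):
# count_interval_bases "/home/Desktop/CNV/ref/wgs.hg38.interval_list"
```

A result in the billions would suggest the tool is being asked to collect counts at nearly every position in the genome rather than at a limited set of sites.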

gatk java • 303 views

What is ${TMPFILE}?

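mkdir tmp

# "$(pwd)" expands to the current directory; wrapping backticks inside $( ) would try to execute the path
TMPFILE="$(pwd)/tmp"

It is a temporary directory.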


I would test the script with a small subset of both the BAM file and the interval file to see if it is indeed a memory/size error or something more general.
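One way to build such a small interval subset is to keep the header and only the intervals on a single contig. This is a sketch under the assumption that the file is a Picard-style interval list (header lines start with "@", data lines are tab-separated with the contig in column 1); the function name subset_interval_list and the choice of chr21 are illustrative only:

```shell
# Keep the header plus only the intervals on one contig.
# $1 = input interval list, $2 = contig name, $3 = output path
subset_interval_list() {
    awk -F'\t' -v contig="$2" '/^@/ || $1 == contig' "$1" > "$3"
}

# Example call (paths are placeholders):
# subset_interval_list wgs.hg38.interval_list chr21 chr21.interval_list
#
# The BAM can be restricted to the same contig with samtools,
# which requires an indexed BAM:
# samtools view -b in.bam chr21 > chr21.bam
```

The header is kept whole so that the sequence dictionary in the subset still matches the reference.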


Yes, I tried subsetting both the BAM file and the interval file as well, but I get a similar error:

[May 22, 2020 12:15:40 PM HKT] org.broadinstitute.hellbender.tools.copynumber.CollectAllelicCounts done. Elapsed time: 21.76 minutes.
Runtime.totalMemory()=15772155904
Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
    at java.util.Arrays.copyOf(Arrays.java:3181)
    at java.util.ArrayList.grow(ArrayList.java:265)
    at java.util.ArrayList.ensureExplicitCapacity(ArrayList.java:239)
    at java.util.ArrayList.ensureCapacityInternal(ArrayList.java:231)
    at java.util.ArrayList.add(ArrayList.java:462)
    at org.broadinstitute.hellbender.tools.copynumber.datacollection.AllelicCountCollector.collectAtLocus(AllelicCountCollector.java:72)
    at org.broadinstitute.hellbender.tools.copynumber.CollectAllelicCounts.apply(CollectAllelicCounts.java:152)
    at org.broadinstitute.hellbender.engine.LocusWalker.lambda$traverse$0(LocusWalker.java:176)
    at org.broadinstitute.hellbender.engine.LocusWalker$$Lambda$91/2118482375.accept(Unknown Source)
    at java.util.Iterator.forEachRemaining(Iterator.java:116)
    at org.broadinstitute.hellbender.engine.LocusWalker.traverse(LocusWalker.java:174)
    at org.broadinstitute.hellbender.engine.GATKTool.doWork(GATKTool.java:966)
    at org.broadinstitute.hellbender.cmdline.CommandLineProgram.runTool(CommandLineProgram.java:139)
    at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMainPostParseArgs(CommandLineProgram.java:192)
    at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:211)
    at org.broadinstitute.hellbender.Main.runCommandLineProgram(Main.java:160)
    at org.broadinstitute.hellbender.Main.mainEntry(Main.java:203)
    at org.broadinstitute.hellbender.Main.main(Main.java:289)

real 21m49.559s
user 140m38.016s
sys 0m15.373s


I already downsampled the BAM file from 10X to 1X, but the error is still there. I am using GATK 4.1.7 in a conda environment.


Then the error is more general. Try to run it without variables such as $tmp and outside of the script you are using to narrow down the problem.
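If the variables stay in the script, a quick guard can at least rule out an unset or empty one before gatk is launched. This is only a sketch; the helper name require_vars is hypothetical, and the variable names in the example call are the ones used in the script earlier in this thread:

```shell
# Return non-zero if any of the named shell variables is unset or empty.
require_vars() {
    for var_name in "$@"; do
        # Indirectly read the variable whose name is stored in var_name.
        eval "var_value=\${${var_name}:-}"
        if [ -z "${var_value}" ]; then
            echo "unset or empty variable: ${var_name}" >&2
            return 1
        fi
    done
}

# Example call, placed before the gatk command:
# require_vars WD REF INT DICT NAME TMPFILE || exit 1
```

The helper only checks that each variable has a value; it does not verify that the paths exist.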


Yes, I tried that with no variables in the script. The same error came out.


Then I would contact the developers.
