Question: GATK CollectAllelicCounts error
6 months ago by Peter Chung, Hong Kong
Peter Chung wrote:

I am running GATK CollectAllelicCounts, but I get an error that I don't know how to fix. The interval list was downloaded from the GATK resource bundle: https://storage.cloud.google.com/genomics-public-data/resources/broad/hg38/v0/wgs_calling_regions.hg38.interval_list and I then used GATK PreprocessIntervals to create the interval list passed to the tool.

My script is below:

WD="/home/Desktop/CNV"
REF="${WD}/ref/hg38.fasta"
INT="${WD}/ref/wgs.hg38.interval_list"
DICT="${WD}/ref/hg38.fasta.dict"

# NAME (sample prefix) and TMPFILE (temporary directory) must be set before this point.
# No whitespace may follow a trailing backslash, or the line continuation breaks.
time gatk --java-options "-Xmx16g -Djava.io.tmpdir=${TMPFILE}" CollectAllelicCounts \
--intervals "${INT}" \
--input "${NAME}.addRG.mkdup.recal.bam" \
--reference "${REF}" \
--tmp-dir "${TMPFILE}" \
--sequence-dictionary "${DICT}" \
--output "${NAME}.allelic_counts.tsv"

The error is below:

Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
    at java.util.Arrays.copyOf(Arrays.java:3181)
    at java.util.ArrayList.grow(ArrayList.java:265)
    at java.util.ArrayList.ensureExplicitCapacity(ArrayList.java:239)
    at java.util.ArrayList.ensureCapacityInternal(ArrayList.java:231)
    at java.util.ArrayList.add(ArrayList.java:462)
    at org.broadinstitute.hellbender.tools.copynumber.datacollection.AllelicCountCollector.collectAtLocus(AllelicCountCollector.java:72)
    at org.broadinstitute.hellbender.tools.copynumber.CollectAllelicCounts.apply(CollectAllelicCounts.java:152)
    at org.broadinstitute.hellbender.engine.LocusWalker.lambda$traverse$0(LocusWalker.java:176)
    at org.broadinstitute.hellbender.engine.LocusWalker$$Lambda$91/1519482659.accept(Unknown Source)
    at java.util.Iterator.forEachRemaining(Iterator.java:116)
    at org.broadinstitute.hellbender.engine.LocusWalker.traverse(LocusWalker.java:174)
    at org.broadinstitute.hellbender.engine.GATKTool.doWork(GATKTool.java:966)
    at org.broadinstitute.hellbender.cmdline.CommandLineProgram.runTool(CommandLineProgram.java:139)
    at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMainPostParseArgs(CommandLineProgram.java:192)
    at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:211)
    at org.broadinstitute.hellbender.Main.runCommandLineProgram(Main.java:160)
    at org.broadinstitute.hellbender.Main.mainEntry(Main.java:203)
    at org.broadinstitute.hellbender.Main.main(Main.java:289)

I tried increasing the memory, but that didn't help either.
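
The stack trace ends in ArrayList.add inside AllelicCountCollector.collectAtLocus, which suggests the tool keeps one record in memory for every locus it visits. If that reading is right (it is an assumption, not something confirmed in this thread), heap use would scale with the number of bases covered by --intervals rather than with BAM depth. A minimal sketch for checking how many bases an interval list spans, assuming the Picard-style interval_list layout (header lines starting with @, then tab-separated contig, 1-based start and end). The function name count_interval_bases is hypothetical:

```shell
#!/usr/bin/env bash
# Sum the number of bases covered by a Picard-style interval_list.
# Assumes intervals are inclusive and 1-based, so length = end - start + 1.
count_interval_bases() {
    awk -F'\t' '
        /^@/    { next }                      # skip @HD / @SQ header lines
        NF >= 3 { total += $3 - $2 + 1 }      # add the length of each interval
        END     { printf "%.0f\n", total }    # %.0f avoids integer clamping in some awks
    ' "$1"
}
```

For example, count_interval_bases "${WD}/ref/wgs.hg38.interval_list" prints the total. A result in the billions would mean the tool is being asked to visit billions of loci, in which case a much smaller list of sites may be worth trying before contacting the developers.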

java gatk • 170 views
written 6 months ago by Peter Chung

What is ${TMPFILE}?

written 6 months ago by ATpoint
mkdir tmp

TMPFILE="$(pwd)/tmp"

It is a temporary directory.

modified 6 months ago • written 6 months ago by Peter Chung

I would test the script with a small subset of both the BAM file and the interval file to see if it is indeed a memory/size error or something more general.
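
One way to build such a subset is to restrict both inputs to a single contig. The sketch below is an illustration only: the function names are hypothetical, the interval file is assumed to be a Picard-style interval_list (header lines starting with @, then tab-separated fields with the contig first), and the BAM step assumes samtools is installed and the input BAM is indexed.

```shell
#!/usr/bin/env bash
# Keep the header plus only the intervals on one contig.
subset_interval_list() {
    local contig="$1" in_list="$2"
    awk -F'\t' -v c="$contig" '/^@/ || $1 == c' "$in_list"
}

# Extract the reads on one contig from an indexed BAM and index the result.
# Requires samtools on PATH.
subset_bam() {
    local contig="$1" in_bam="$2" out_bam="$3"
    samtools view -b -o "$out_bam" "$in_bam" "$contig" && samtools index "$out_bam"
}
```

For example, subset_interval_list chr21 wgs.hg38.interval_list > chr21.interval_list followed by subset_bam chr21 sample.bam sample.chr21.bam (file names are placeholders) gives a pair of small inputs to pass to --intervals and --input.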

written 6 months ago by ATpoint

Yes, I tried subsetting both the BAM file and the interval file, but I still get a similar error:

[May 22, 2020 12:15:40 PM HKT] org.broadinstitute.hellbender.tools.copynumber.CollectAllelicCounts done. Elapsed time: 21.76 minutes.
Runtime.totalMemory()=15772155904
Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
    at java.util.Arrays.copyOf(Arrays.java:3181)
    at java.util.ArrayList.grow(ArrayList.java:265)
    at java.util.ArrayList.ensureExplicitCapacity(ArrayList.java:239)
    at java.util.ArrayList.ensureCapacityInternal(ArrayList.java:231)
    at java.util.ArrayList.add(ArrayList.java:462)
    at org.broadinstitute.hellbender.tools.copynumber.datacollection.AllelicCountCollector.collectAtLocus(AllelicCountCollector.java:72)
    at org.broadinstitute.hellbender.tools.copynumber.CollectAllelicCounts.apply(CollectAllelicCounts.java:152)
    at org.broadinstitute.hellbender.engine.LocusWalker.lambda$traverse$0(LocusWalker.java:176)
    at org.broadinstitute.hellbender.engine.LocusWalker$$Lambda$91/2118482375.accept(Unknown Source)
    at java.util.Iterator.forEachRemaining(Iterator.java:116)
    at org.broadinstitute.hellbender.engine.LocusWalker.traverse(LocusWalker.java:174)
    at org.broadinstitute.hellbender.engine.GATKTool.doWork(GATKTool.java:966)
    at org.broadinstitute.hellbender.cmdline.CommandLineProgram.runTool(CommandLineProgram.java:139)
    at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMainPostParseArgs(CommandLineProgram.java:192)
    at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:211)
    at org.broadinstitute.hellbender.Main.runCommandLineProgram(Main.java:160)
    at org.broadinstitute.hellbender.Main.mainEntry(Main.java:203)
    at org.broadinstitute.hellbender.Main.main(Main.java:289)

real 21m49.559s
user 140m38.016s
sys 0m15.373s

written 6 months ago by Peter Chung

I already downsampled the BAM file from 10X to 1X, but the error is still there. I am using GATK 4.1.7 in a conda environment.

written 6 months ago by Peter Chung

Then the error is more general. Try running the command without variables such as $tmp, and outside of the script you are using, to narrow down the problem.
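
A complementary check, if the variables are kept, is to confirm that each one expands to a path that actually exists before launching GATK. A minimal sketch, where check_inputs is a hypothetical helper and not part of GATK:

```shell
#!/usr/bin/env bash
# Report any argument that is empty (likely an unset variable) or that
# does not exist on disk. Returns non-zero if at least one check fails.
check_inputs() {
    local status=0 path
    for path in "$@"; do
        if [ -z "$path" ]; then
            echo "empty path (unset variable?)" >&2
            status=1
        elif [ ! -e "$path" ]; then
            echo "missing: $path" >&2
            status=1
        fi
    done
    return "$status"
}
```

With the names from the original script, this could be called as check_inputs "${REF}" "${INT}" "${DICT}" "${TMPFILE}" "${NAME}.addRG.mkdup.recal.bam" just before the gatk command.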

written 6 months ago by ATpoint

Yes, I tried that with no variables in the command, and the same error came out.

written 6 months ago by Peter Chung

Then I would contact the developers.

written 6 months ago by ATpoint