Question: HaplotypeCaller: java.lang.NullPointerException when multiple-line bed file is used
8 months ago, godth13teen wrote:

I ran HaplotypeCaller on the CRAM file of sample NA12878 from IGSR. The reference FASTA was also downloaded from IGSR, and the BED interval file is one I made myself (tab-delimited). My setup:

  • GATK version used: 4.1.4.1
  • Exact GATK commands used:

    gatk HaplotypeCaller -I NA12878.final.cram -O NA12878.final.vcf -R GRCh38_full_analysis_set_plus_decoy_hla.fa -L vdj_hg38.bed
    
  • The entire error log:

    Runtime.totalMemory()=187695104
    java.lang.NullPointerException
    at java.base/java.util.ComparableTimSort.countRunAndMakeAscending(ComparableTimSort.java:325)
    at java.base/java.util.ComparableTimSort.sort(ComparableTimSort.java:202)
    at java.base/java.util.Arrays.sort(Arrays.java:1315)
    at java.base/java.util.Arrays.sort(Arrays.java:1509)
    at java.base/java.util.ArrayList.sort(ArrayList.java:1749)
    at java.base/java.util.Collections.sort(Collections.java:145)
    at org.broadinstitute.hellbender.utils.IntervalUtils.sortAndMergeIntervals(IntervalUtils.java:492)
    at org.broadinstitute.hellbender.utils.IntervalUtils.getIntervalsWithFlanks(IntervalUtils.java:990)
    at org.broadinstitute.hellbender.utils.IntervalUtils.getIntervalsWithFlanks(IntervalUtils.java:1005)
    at org.broadinstitute.hellbender.engine.MultiIntervalLocalReadShard.<init>(MultiIntervalLocalReadShard.java:59)
    at org.broadinstitute.hellbender.engine.AssemblyRegionWalker.makeReadShards(AssemblyRegionWalker.java:104)
    at org.broadinstitute.hellbender.engine.AssemblyRegionWalker.onStartup(AssemblyRegionWalker.java:84)
    at org.broadinstitute.hellbender.cmdline.CommandLineProgram.runTool(CommandLineProgram.java:137)
    at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMainPostParseArgs(CommandLineProgram.java:191)
    at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:210)
    at org.broadinstitute.hellbender.Main.runCommandLineProgram(Main.java:163)
    at org.broadinstitute.hellbender.Main.mainEntry(Main.java:206)
    at org.broadinstitute.hellbender.Main.main(Main.java:292)
    

I have tried running HaplotypeCaller with no -L, with -L given a manually typed interval, and with a single-line BED file; all of these work fine. But my actual BED file has multiple lines, e.g.

chr2 88,857,361 89,330,679
chr14 105,566,277 106,879,844
chr22 22,026,076 22,922,913

Could you please help me figure out what has gone wrong?

Edit: summary of what I have tried:

  • Removed the commas from the example above: works
  • Used another BED file containing intervals from chr1 to chr9: works
  • Used another BED file containing intervals from chr10: does NOT work
  • Used another BED file containing intervals from chr1 to chr10: does NOT work
  • Used another BED file containing intervals from chr11 to chr19: works
  • Used another BED file containing intervals from chr11 to chr22: does NOT work
  • Used another BED file containing intervals from chr1 to chr9 and chr11 to chr19: works
gatk variant calling • 210 views
modified 8 months ago by Pierre Lindenbaum • written 8 months ago by godth13teen

You should remove all the commas from the coordinates and make sure the BED file is sorted and formatted properly.

BED file format

modified 8 months ago • written 8 months ago by Arup Ghosh
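The clean-up suggested in the reply above (no commas, tab-delimited, sorted) could be sketched roughly as follows. This is only an illustration and not part of GATK: the function names are hypothetical, and "sorted" is assumed here to mean natural chromosome order (chr2 before chr10) followed by start position. Sorting by the contig order of your own reference may be the safer choice.

```python
import re

def normalize_bed_line(line):
    """Parse one BED line and strip thousands separators from the coordinates."""
    fields = line.split()  # accept tabs or spaces on input
    chrom, start, end = fields[0], fields[1], fields[2]
    return chrom, int(start.replace(",", "")), int(end.replace(",", ""))

def chrom_sort_key(chrom):
    """Numbered chromosomes in numeric order; other names (chrX, ...) after them."""
    match = re.fullmatch(r"(?:chr)?(\d+)", chrom)
    if match:
        return (0, int(match.group(1)), "")
    return (1, 0, chrom)

def normalize_bed(lines):
    """Return tab-delimited BED lines, sorted by chromosome, start, then end."""
    records = []
    for line in lines:
        # Skip blank lines, comments, and UCSC-style header lines.
        if not line.strip() or line.startswith(("#", "track", "browser")):
            continue
        records.append(normalize_bed_line(line))
    records.sort(key=lambda r: (chrom_sort_key(r[0]), r[1], r[2]))
    return ["\t".join(str(field) for field in record) for record in records]

# The three intervals from the question, with commas as originally posted.
example = [
    "chr2 88,857,361 89,330,679",
    "chr14 105,566,277 106,879,844",
    "chr22 22,026,076 22,922,913",
]
for bed_line in normalize_bed(example):
    print(bed_line)
```

To process a file, one could pass an open file handle as `lines` and write the returned strings to a new `.bed` file.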

I used another BED file that follows this advice, but it still gives the NullPointerException:

chr1    47264754    47264933
chr1    47276480    47276621
chr1    47276812    47276856
chr1    47278168    47278295
chr1    47279154    47279278
chr1    47279581    47279735
chr1    47279881    47279987
chr10   135345089   135345238
chr10   135345628   135345788
chr10   135346196   135346372
chr10   135347260   135347401
chr10   135350567   135350754
chr10   135351255   135351396
chr10   135352284   135352468
modified 8 months ago • written 8 months ago by godth13teen

Have you sorted this bed file?

written 8 months ago by Arup Ghosh

Yes, the file is sorted by chromosome and then by position. Is that right? I used the BED file in exactly this order:

chr1    47264754    47264933
chr1    47276480    47276621
chr1    47276812    47276856
chr1    47278168    47278295
chr1    47279154    47279278
chr1    47279581    47279735
chr1    47279881    47279987
chr10   135345089   135345238
chr10   135345628   135345788
chr10   135346196   135346372
chr10   135347260   135347401
chr10   135350567   135350754
chr10   135351255   135351396
chr10   135352284   135352468
written 8 months ago by godth13teen

Please see my updated post

written 8 months ago by godth13teen

I have done some manual testing. I found that I can use many intervals as long as they are in the range chr1 - chr9; every time I add a line from chr10 - chr22, this error appears. Really weird.

written 8 months ago by godth13teen

It's hard to tell from this excerpt what's wrong with chr10 - chr22. You can try the following troubleshooting steps.

1) Make sure the intervals stay within the chromosome sizes of your reference file (for both alignment and variant calling).

2) Make sure the BED file you are using is a valid one. Seqanswers thread

3) Make sure the chromosome nomenclature matches between your reference and your interval file.

4) Make sure your BED file follows the specification recommended in the GATK interval documentation.

C. BED files with extension .bed We also accept the widely-used BED format, where intervals are in the form <chr> <start> <stop>, with fields separated by tabs. However, you should be aware that this file format is 0-based for the start coordinates, so coordinates taken from 1-based formats (e.g. if you're cooking up a custom interval list derived from a file in a 1-based format) should be offset by 1. The GATK engine recognizes the .bed extension and interprets the coordinate system accordingly.

modified 8 months ago • written 8 months ago by Arup Ghosh
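To make the coordinate offset in the quoted documentation concrete, here is a small sketch of converting between the two systems. It assumes the usual BED convention (0-based start, exclusive end) and a 1-based, inclusive `chr:start-stop` string of the kind one might type by hand after -L. The function names are hypothetical, not GATK API.

```python
def bed_to_one_based(chrom, bed_start, bed_end):
    """Convert a 0-based, half-open BED interval to a 1-based, closed string.

    Only the start shifts by one; the exclusive BED end equals the
    inclusive 1-based end.
    """
    if bed_start < 0 or bed_end <= bed_start:
        raise ValueError(f"empty or reversed interval: {chrom} {bed_start} {bed_end}")
    return f"{chrom}:{bed_start + 1}-{bed_end}"

def one_based_to_bed(chrom, start, end):
    """Convert a 1-based, closed interval back to BED fields."""
    if start < 1 or end < start:
        raise ValueError(f"empty or reversed interval: {chrom}:{start}-{end}")
    return chrom, start - 1, end

# First interval from the BED file posted in this thread.
print(bed_to_one_based("chr1", 47264754, 47264933))
```

Both representations describe the same bases, so the interval length is `bed_end - bed_start` in either case.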
8 months ago, Pierre Lindenbaum (France/Nantes/Institut du Thorax - INSERM UMR1087) wrote:

In hg38, the size of chr10 is 133797422.

There is a clear problem in your BED file: these intervals lie beyond the end of chr10.

chr10   135345089   135345238
chr10   135345628   135345788
chr10   135346196   135346372
chr10   135347260   135347401
chr10   135350567   135350754
chr10   135351255   135351396
chr10   135352284   135352468
modified 8 months ago • written 8 months ago by Pierre Lindenbaum
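A check like the one in this answer can be automated by comparing each interval against the contig lengths in the reference's `.fai` index (the file produced by `samtools faidx`, whose first two tab-separated columns are contig name and length). The sketch below is a hedged illustration with hypothetical function names. It assumes commas have already been removed from the coordinates, and reports them as a problem otherwise.

```python
def read_fai_lengths(fai_lines):
    """Map contig name -> length from the lines of a .fai index."""
    lengths = {}
    for line in fai_lines:
        if line.strip():
            name, length = line.split("\t")[:2]
            lengths[name] = int(length)
    return lengths

def find_bad_intervals(bed_lines, lengths):
    """Return (line_number, message) for every BED line that fails a check."""
    problems = []
    for n, line in enumerate(bed_lines, start=1):
        if not line.strip():
            continue
        try:
            chrom, start, end = line.split()[:3]
            start, end = int(start), int(end)
        except ValueError:
            problems.append((n, "expected <chr> <start> <stop> with integer coordinates"))
            continue
        if chrom not in lengths:
            problems.append((n, f"{chrom}: contig not found in reference index"))
        elif start < 0 or start >= end:
            problems.append((n, f"{chrom}:{start}-{end}: empty or reversed interval"))
        elif end > lengths[chrom]:
            problems.append((n, f"{chrom}:{start}-{end}: ends past contig length {lengths[chrom]}"))
    return problems

# chr10 length as given in the answer above; the chr1 length is the GRCh38
# value and is not stated in this thread.
lengths = {"chr1": 248956422, "chr10": 133797422}
bed = ["chr1\t47264754\t47264933", "chr10\t135345089\t135345238"]
for n, msg in find_bad_intervals(bed, lengths):
    print(f"line {n}: {msg}")
# prints: line 2: chr10:135345089-135345238: ends past contig length 133797422
```

In practice the `lengths` dictionary would come from calling `read_fai_lengths` on the open `.fai` file that sits next to the reference FASTA.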

Thank you for the clarification.

written 8 months ago by godth13teen

Please validate and close the question by clicking the green check mark on the left.

written 8 months ago by Pierre Lindenbaum