GATK RealignerTargetCreator: IllegalArgumentException: Dictionary cannot have size zero
2.4 years ago
Pac314 ▴ 10

I am new to variant calling and trying to create realignment targets using GATK but keep getting this error, despite having a dictionary file:

 java.lang.IllegalArgumentException: Dictionary cannot have size zero
    at org.broadinstitute.gatk.utils.MRUCachingSAMSequenceDictionary.<init>(MRUCachingSAMSequenceDictionary.java:62)
    at org.broadinstitute.gatk.utils.GenomeLocParser$1.initialValue(GenomeLocParser.java:78)
    at org.broadinstitute.gatk.utils.GenomeLocParser$1.initialValue(GenomeLocParser.java:75)
    at java.lang.ThreadLocal.setInitialValue(ThreadLocal.java:180)
    at java.lang.ThreadLocal.get(ThreadLocal.java:170)
    at org.broadinstitute.gatk.utils.GenomeLocParser.getContigInfo(GenomeLocParser.java:91)
    at org.broadinstitute.gatk.utils.GenomeLocParser.getContigs(GenomeLocParser.java:204)
    at org.broadinstitute.gatk.utils.GenomeLocParser.<init>(GenomeLocParser.java:135)
    at org.broadinstitute.gatk.utils.GenomeLocParser.<init>(GenomeLocParser.java:108)
    at org.broadinstitute.gatk.utils.GenomeLocSortedSet.createSetFromSequenceDictionary(GenomeLocSortedSet.java:421)
    at org.broadinstitute.gatk.engine.datasources.reads.BAMScheduler.createOverMappedReads(BAMScheduler.java:66)
    at org.broadinstitute.gatk.engine.datasources.reads.IntervalSharder.shardOverMappedReads(IntervalSharder.java:55)
    at org.broadinstitute.gatk.engine.datasources.reads.SAMDataSource.createShardIteratorOverMappedReads(SAMDataSource.java:1217)
    at org.broadinstitute.gatk.engine.GenomeAnalysisEngine.getShardStrategy(GenomeAnalysisEngine.java:649)
    at org.broadinstitute.gatk.engine.GenomeAnalysisEngine.execute(GenomeAnalysisEngine.java:307)
    at org.broadinstitute.gatk.engine.CommandLineExecutable.execute(CommandLineExecutable.java:113)
    at org.broadinstitute.gatk.utils.commandline.CommandLineProgram.start(CommandLineProgram.java:255)
    at org.broadinstitute.gatk.utils.commandline.CommandLineProgram.start(CommandLineProgram.java:157)
    at org.broadinstitute.gatk.engine.CommandLineGATK.main(CommandLineGATK.java:108)

My code for the command is:

java -jar -Xmx8G gatk.jar \
   -T RealignerTargetCreator \
   -R $ref \
   -I sample_marked.bam \
   -o sample_realignment_targets.list

The dictionary contains this:

@HD VN:1.5
@SQ SN:ST4  LN:61165649 M5:d17409fc1c12bcd44fcb59e2803c3659 UR:file:~/varcalling/reference.fa

Please could you help me solve this issue.


What are the outputs of the following commands?

wc -l ref.fa.fai ref.dict
file  ref.fa.fai ref.dict
samtools view -H sample_marked.bam | grep '@SQ'

Thanks for your reply.

For this wc -l ref.fa.fai ref.dict the output is:

1 ref.fa.fai
2 ref.dict
3 total

For file ref.fa.fai ref.dict:

ref.fa.fai:  ASCII text
ref.dict: ASCII text

For samtools view -H sample_marked.bam | grep '@SQ': There is not any output.


There is not any output.

This is your problem: the BAM header has no @SQ lines, so the BAM file is missing its sequence dictionary. Check your upstream workflow.
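
One way to confirm this is to compare the @SQ records in the BAM header with those in the reference dictionary. The following is only a minimal sketch and not part of the original reply: the helper names `sq_names` and `compare_sq` are hypothetical, and it assumes both inputs are tab-delimited text as in the SAM header format.

```shell
#!/usr/bin/env bash
# Hypothetical helpers (not from the thread): compare the @SQ records of a
# SAM/BAM header with those of a sequence dictionary (.dict) file.

# Print the sorted SN (sequence name) values of all @SQ lines read from stdin.
sq_names() {
    awk -F'\t' '$1 == "@SQ" {
        for (i = 2; i <= NF; i++)
            if ($i ~ /^SN:/) print substr($i, 4)
    }' | sort
}

# compare_sq HEADER_FILE DICT_FILE
# Returns 0 only if both files list the same, non-empty set of sequence names.
compare_sq() {
    local header_names dict_names
    header_names=$(sq_names < "$1")
    dict_names=$(sq_names < "$2")
    if [ -z "$header_names" ]; then
        echo "header has no @SQ lines" >&2
        return 1
    fi
    [ "$header_names" = "$dict_names" ]
}
```

Assuming samtools is on the PATH, the header can be dumped with `samtools view -H sample_marked.bam > header.sam` and then checked with `compare_sq header.sam ref.dict`. For the BAM described above, the function should report that the header has no @SQ lines.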


After I mark the duplicates, the BAM has an "@SQ" line but lacks read groups. When I manually add read groups using AddOrReplaceReadGroups, the BAM loses the "@SQ" line but retains the read groups. Why would this happen?
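
The thread does not say why the "@SQ" line disappears, but the step that drops it can be located by counting the @SQ and @RG header lines after each stage of the workflow. This is a rough sketch under stated assumptions: `header_summary` is a hypothetical helper, and `sample_rg.bam` is a placeholder name for the AddOrReplaceReadGroups output, which is not named in the thread.

```shell
#!/usr/bin/env bash
# Hypothetical helper: summarise a SAM header read from stdin
# as "SQ=<count> RG=<count>".
header_summary() {
    awk -F'\t' '
        $1 == "@SQ" { sq++ }
        $1 == "@RG" { rg++ }
        END { printf "SQ=%d RG=%d\n", sq, rg }
    '
}

# Example loop, left commented out because it needs samtools and the
# file names are placeholders for the intermediate BAMs:
# for bam in sample_marked.bam sample_rg.bam; do
#     printf '%s\t' "$bam"
#     samtools view -H "$bam" | header_summary
# done
```

A file that reports `SQ=0` is the first one in the chain that has lost its sequence dictionary, so the command that produced it is the one to inspect.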
