Question: Gatk Indel Realignment Error - Mismatch In Index Files And Dict File
1
7.0 years ago by
United States
Zev.Kronenberg11k wrote:

Greetings,

I am aligning pooled sequencing data to a new reference genome. GATK won't generate intervals. Is this because not every scaffold in the reference is found in my BAM index?

What am I missing? It seems like what I am trying to do isn't unreasonable.

    INFO  15:52:42,596 GATKRunReport - Uploaded run statistics report to AWS S3
    ##### ERROR ------------------------------------------------------------------------------------------
    ##### ERROR A USER ERROR has occurred (version 1.5-3-gbb2c10b): 
    ##### ERROR The invalid arguments or inputs must be corrected before the GATK can proceed
    ##### ERROR Please do not post this error to the GATK forum
    ##### ERROR
    ##### ERROR See the documentation (rerun with -h) for this tool to view allowable command-line arguments.
    ##### ERROR Visit our wiki for extensive documentation <http://www.broadinstitute.org/gsa/wiki>
    ##### ERROR Visit our forum to view answers to commonly asked questions <http://getsatisfaction.com/gsa>
    ##### ERROR
    ##### ERROR MESSAGE: Couldn't read file /home/zkronenb/Projects/xxx/reference_assembly/withoutsanger.fa because Sequence dictionary and index contain different numbers of contigs
indel gatk error • 3.4k views
1
7.0 years ago by
Wen.Huang1.1k wrote:

The latest GATK seems to be able to generate the .dict on the fly. Try removing the existing .dict and having GATK regenerate it.
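
To see which file is out of step, it may help to list the contigs each one knows about. The error message appears to compare the reference's sequence dictionary (.dict) with its FASTA index (.fai), so here is a minimal Python sketch that does the same comparison by hand. It assumes the usual layouts: a tab-separated .fai with the contig name in the first column, and a .dict with one @SQ line per contig carrying an SN: field. The function names are made up for illustration and are not part of GATK.

```python
def fai_contigs(path):
    """Contig names from a .fai index (assumed: name is the first tab-separated column)."""
    with open(path) as handle:
        return [line.split("\t")[0] for line in handle if line.strip()]


def dict_contigs(path):
    """Contig names from the SN: fields of the @SQ lines in a .dict file."""
    names = []
    with open(path) as handle:
        for line in handle:
            if not line.startswith("@SQ"):
                continue
            for field in line.rstrip("\n").split("\t")[1:]:
                if field.startswith("SN:"):
                    names.append(field[3:])
    return names


def compare_contigs(fai_path, dict_path):
    """Summarise how the two files differ (hypothetical helper)."""
    fai = fai_contigs(fai_path)
    dic = dict_contigs(dict_path)
    return {
        "fai_count": len(fai),
        "dict_count": len(dic),
        "only_in_fai": sorted(set(fai) - set(dic)),
        "only_in_dict": sorted(set(dic) - set(fai)),
    }
```

Calling `compare_contigs` on the .fai and .dict that sit next to your reference FASTA should show whether the counts differ and which contigs are missing from one side.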


Thanks! I was stuck in old habits.

1
7.0 years ago by
United States
Zev.Kronenberg11k wrote:

https://getsatisfaction.com/gsa/topics/a_simple_problem_with_creating_dict_files

I was running indel realignment on a non-model organism and getting an error stating: loc:malformed unknown contig. The source of the error was my reference FASTA: the sequence lines weren't wrapped, so the last contig wasn't being added to my .dict file. After I wrapped the lines, everything worked fine.
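
For anyone who needs to do the same, here is a rough Python sketch of the wrapping step. It rewrites a FASTA so that every sequence line is at most `width` characters long and the file ends with a newline. The function name, the 60-column default, and the choice to hold one contig in memory at a time are my own assumptions, not anything GATK requires.

```python
def wrap_fasta(in_path, out_path, width=60):
    """Rewrite a FASTA file with sequence lines wrapped to `width` characters."""

    def write_wrapped(dst, chunks):
        # Join the pieces of one contig and emit them in fixed-width lines.
        sequence = "".join(chunks)
        for start in range(0, len(sequence), width):
            dst.write(sequence[start:start + width] + "\n")

    with open(in_path) as src, open(out_path, "w") as dst:
        chunks = []
        for line in src:
            line = line.strip()
            if not line:
                continue  # skip blank lines
            if line.startswith(">"):
                write_wrapped(dst, chunks)  # finish the previous contig
                chunks = []
                dst.write(line + "\n")      # header lines are copied unchanged
            else:
                chunks.append(line)
        write_wrapped(dst, chunks)          # do not forget the last contig
```

After rewrapping, the existing .fai and .dict no longer describe the file, so they should be deleted and regenerated.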

0
7.0 years ago by
Sweden
Johan840 wrote:

In addition to Wen.Huang's answer: make sure that the reference genome you are using contains the same contig names as the genome you aligned against. You can check that they match by looking at the BAM file header using "samtools view -h" and comparing the names there to those found in the reference.
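
If there are too many scaffolds to compare by eye, something like the following Python sketch could do the comparison. It takes the header text (for example, saved from `samtools view -H`, which prints only the header) and the reference FASTA, and reports names found on one side but not the other. It assumes the contig name is the text after ">" up to the first whitespace. The function names are hypothetical.

```python
def sam_header_contigs(header_text):
    """Contig names from the SN: fields of @SQ lines in SAM/BAM header text."""
    names = []
    for line in header_text.splitlines():
        if line.startswith("@SQ"):
            for field in line.split("\t")[1:]:
                if field.startswith("SN:"):
                    names.append(field[3:])
    return names


def fasta_contigs(path):
    """Contig names from FASTA header lines (assumed: text after '>' up to first whitespace)."""
    names = []
    with open(path) as handle:
        for line in handle:
            if line.startswith(">"):
                tokens = line[1:].split()
                if tokens:
                    names.append(tokens[0])
    return names


def contig_mismatches(header_text, fasta_path):
    """Return (names only in the BAM header, names only in the reference)."""
    bam = set(sam_header_contigs(header_text))
    ref = set(fasta_contigs(fasta_path))
    return sorted(bam - ref), sorted(ref - bam)
```

Two empty lists mean the names agree. Anything in the first list is a contig the BAM was aligned against that the reference you are passing to GATK does not contain.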
