6.6 years ago
Ram
I'm facing a strange GATK error in one of my operations. I'm not sure what's happening here. See:
##### ERROR ------------------------------------------------------------------------------------------
##### ERROR stack trace
java.lang.IllegalArgumentException: Permitted to write any record upstream of position 195509918, but a record at 3:195509765 was just added.
at htsjdk.variant.variantcontext.writer.SortingVariantContextWriterBase.noteCurrentRecord(SortingVariantContextWriterBase.java:161)
at htsjdk.variant.variantcontext.writer.SortingVariantContextWriter.noteCurrentRecord(SortingVariantContextWriter.java:55)
at htsjdk.variant.variantcontext.writer.SortingVariantContextWriterBase.add(SortingVariantContextWriterBase.java:128)
at edu.mssm.gatk.walkers.variantutils.AnnotateLikelyPathogenic$RetainedVariants.writeUpTo(AnnotateLikelyPathogenic.java:684)
at edu.mssm.gatk.walkers.variantutils.AnnotateLikelyPathogenic$RetainedVariants.writeUpTo(AnnotateLikelyPathogenic.java:692)
at edu.mssm.gatk.walkers.variantutils.AnnotateLikelyPathogenic.map(AnnotateLikelyPathogenic.java:324)
at edu.mssm.gatk.walkers.variantutils.AnnotateLikelyPathogenic.map(AnnotateLikelyPathogenic.java:42)
at org.broadinstitute.gatk.engine.traversals.TraverseLociNano$TraverseLociMap.apply(TraverseLociNano.java:267)
at org.broadinstitute.gatk.engine.traversals.TraverseLociNano$TraverseLociMap.apply(TraverseLociNano.java:255)
at org.broadinstitute.gatk.utils.nanoScheduler.NanoScheduler.executeSingleThreaded(NanoScheduler.java:274)
at org.broadinstitute.gatk.utils.nanoScheduler.NanoScheduler.execute(NanoScheduler.java:245)
at org.broadinstitute.gatk.engine.traversals.TraverseLociNano.traverse(TraverseLociNano.java:144)
at org.broadinstitute.gatk.engine.traversals.TraverseLociNano.traverse(TraverseLociNano.java:92)
at org.broadinstitute.gatk.engine.traversals.TraverseLociNano.traverse(TraverseLociNano.java:48)
at org.broadinstitute.gatk.engine.executive.LinearMicroScheduler.execute(LinearMicroScheduler.java:99)
at org.broadinstitute.gatk.engine.GenomeAnalysisEngine.execute(GenomeAnalysisEngine.java:319)
at org.broadinstitute.gatk.engine.CommandLineExecutable.execute(CommandLineExecutable.java:121)
at org.broadinstitute.gatk.utils.commandline.CommandLineProgram.start(CommandLineProgram.java:248)
at org.broadinstitute.gatk.utils.commandline.CommandLineProgram.start(CommandLineProgram.java:155)
at org.broadinstitute.gatk.engine.CommandLineGATK.main(CommandLineGATK.java:107)
##### ERROR ------------------------------------------------------------------------------------------
##### ERROR A GATK RUNTIME ERROR has occurred (version 2.7.2b-987-g23a99ad):
##### ERROR
##### ERROR This might be a bug. Please check the documentation guide to see if this is a known problem.
##### ERROR If not, please post the error message, with stack trace, to the GATK forum.
##### ERROR Visit our website and forum for extensive documentation and answers to
##### ERROR commonly asked questions http://www.broadinstitute.org/gatk
##### ERROR
##### ERROR MESSAGE: Permitted to write any record upstream of position 195509918, but a record at 3:195509765 was just added.
##### ERROR ------------------------------------------------------------------------------------------
Can anyone help me out here? I can provide more information once I get a direction on how to dig deeper into this.
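One direction for digging deeper: the message says a record at 3:195509765 arrived after the writer had already moved past 195509918, so it may be worth checking whether the input VCF itself is coordinate-sorted. The following is only a minimal sketch under that assumption, not something taken from GATK or htsjdk. The function name `find_out_of_order` and the demo lines are hypothetical. It reports records that sit behind the highest position seen so far on the same contig, or whose contig shows up again after another contig has started. It does not compare contig order against the reference dictionary.

```python
from typing import Iterable, Iterator, Tuple


def find_out_of_order(lines: Iterable[str]) -> Iterator[Tuple[int, str, int, str]]:
    """Yield (line_number, chrom, pos, reason) for records that break sort order.

    `lines` is any iterable of VCF text lines (hypothetical helper, plain text input).
    """
    finished = set()   # contigs whose block of records has already ended
    current = None     # contig of the block currently being read
    last_pos = 0       # highest position seen so far on the current contig

    for line_no, line in enumerate(lines, start=1):
        # Skip blank lines and header lines (## meta lines and the #CHROM line).
        if not line.strip() or line.startswith("#"):
            continue

        fields = line.split("\t")
        chrom, pos = fields[0], int(fields[1])

        if chrom != current:
            # A contig that comes back after its block ended means the file is not grouped.
            if chrom in finished:
                yield (line_no, chrom, pos, "contig reappears after another contig")
            if current is not None:
                finished.add(current)
            current, last_pos = chrom, pos
            continue

        if pos < last_pos:
            yield (line_no, chrom, pos, f"position is before earlier record at {last_pos}")
        last_pos = max(last_pos, pos)


# Demo with made-up records that reuse the two positions from the error message.
demo = [
    "##fileformat=VCFv4.1",
    "#CHROM\tPOS\tID\tREF\tALT",
    "3\t195509918\t.\tA\tG",
    "3\t195509765\t.\tC\tT",
]
for problem in find_out_of_order(demo):
    print(problem)
```

To run it on a real file, pass an open text handle, for example `find_out_of_order(open(path))`, or `gzip.open(path, "rt")` for a compressed VCF. If the input turns out to be sorted, the ordering problem is more likely introduced by the tool that buffers and writes the records than by the file.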
It's a standard file used across the organization. I've used it multiple times, but yeah, the VCF file was created elsewhere, so maybe I should get the genome index from the other org?
Using a shared system? That can always lead to compatibility issues! If you could test it on your laptop/Mac, that would help, just to see whether the shared system is indeed the problem.
I would if I could. It works fine when the same samples are split into smaller pools, and there are way too many (1000+) to run on my local computer. I think this might have happened because my Java temp directory ran out of space. I cleared that up, tried a different technique, and got the results. I'll try the old technique again later once I get some time.
Good call. I had also wondered if it was a space or RAM issue, but wasn't sure to what extent.
It's not a RAM or space issue - I tried the old route once again, and it gave me a whole new error on features being out of order.
This could be because of the variety of tools and GATK versions used along the pipeline - the VCF was generated using GATK 3.6+, and I'm trying to process it using GATK 2.7 (which is what's used in my org's super-outdated pipeline). Anyway, it's done and I'm going to leave this here.
That's exactly what a Ram would say.
You made me laugh! - a rare event!
I have a theory: any sufficiently populated online forum is (at times) indistinguishable from reddit.
It's Friday afternoon in my timezone if that explains anything.
See? "It's Friday 5 PM somewhere"
Only 10:15AM here... I'm abroad working on a project...