Question: Can someone please help me run any GATK4 pipeline?
18 months ago, moxu wrote:

GATK4 is a great variant-calling toolkit and dominates the field. Unfortunately, I have not been able to run any of the GATK4 pipelines, probably because they all use Google Cloud Storage, which I am not familiar with. I am now trying to run the $5 variant calling pipeline because I guess it is the easiest one to run.

I downloaded the pipeline using:

git clone https://github.com/gatk-workflows/five-dollar-genome-analysis-pipeline.git

The command line used was:

java -jar cromwell-31.jar run germline_single_sample_workflow.wdl --inputs germline_single_sample_workflow.hg38.inputs.json

Both the .wdl file and the .json file come from the GitHub repository and were unmodified when I ran the command above.

The (error) messages I got from the above command line execution are attached at the bottom of this post.

Can someone please tell me what I need to do to make this work?

Thanks much!

--------------------------------- (error) messages snippets -------------------------------------

[2018-05-31 10:08:28,06] [info] Running with database db.url = jdbc:hsqldb:mem:9f3b961e-97d8-4fc4-a30a-7e86c6f14bdc;shutdown=false;hsqldb.tx=mvcc
[2018-05-31 10:08:32,43] [info] Running migration RenameWorkflowOptionsInMetadata with a read batch size of 100000 and a write batch size of 100000
[2018-05-31 10:08:32,44] [info] [RenameWorkflowOptionsInMetadata] 100%
[2018-05-31 10:08:32,52] [info] Running with database db.url = jdbc:hsqldb:mem:11019e42-b5ee-4466-bbad-80dbf98f3c00;shutdown=false;hsqldb.tx=mvcc
[2018-05-31 10:08:32,81] [info] Slf4jLogger started
[2018-05-31 10:08:32,98] [info] Metadata summary refreshing every 2 seconds.
[2018-05-31 10:08:33,01] [info] KvWriteActor configured to flush with batch size 200 and process rate 5 seconds.

...

[2018-05-31 10:08:41,71] [warn] Local [6b73056d]: Key/s [memory, disks, preemptible] is/are not supported by backend. Unsupported attributes will not be part of job executions.
[2018-05-31 10:08:41,71] [warn] Local [6b73056d]: Key/s [preemptible, disks, cpu, memory] is/are not supported by backend. Unsupported attributes will not be part of job executions.
[2018-05-31 10:08:43,92] [info] WorkflowExecutionActor-6b73056d-8171-4712-a05a-b8dfcdeb36d6 [6b73056d]: Starting germline_single_sample_workflow.ScatterIntervalList
[2018-05-31 10:08:44,97] [info] fe6db5b3-d91a-40d4-a35b-cf3c937deaaa-SubWorkflowActor-SubWorkflow-to_bam_workflow:-1:1 [fe6db5b3]: Starting to_bam_workflow.GetBwaVersion
[2018-05-31 10:08:45,95] [warn] BackgroundConfigAsyncJobExecutionActor [fe6db5b3to_bam_workflow.GetBwaVersion:NA:1]: Unrecognized runtime attribute keys: memory
[2018-05-31 10:08:45,95] [warn] BackgroundConfigAsyncJobExecutionActor [6b73056dgermline_single_sample_workflow.ScatterIntervalList:NA:1]: Unrecognized runtime attribute keys: memory
[2018-05-31 10:08:45,98] [error] BackgroundConfigAsyncJobExecutionActor [6b73056dgermline_single_sample_workflow.ScatterIntervalList:NA:1]: Error attempting to Execute
java.lang.Exception: Failed command instantiation
at cromwell.backend.standard.StandardAsyncExecutionActor.instantiatedCommand(StandardAsyncExecutionActor.scala:400)
at cromwell.backend.standard.StandardAsyncExecutionActor.instantiatedCommand$(StandardAsyncExecutionActor.scala:340)
at cromwell.backend.impl.sfs.config.BackgroundConfigAsyncJobExecutionActor.instantiatedCommand$lzycompute(ConfigAsyncJobExecutionActor.scala:191)
at cromwell.backend.impl.sfs.config.BackgroundConfigAsyncJobExecutionActor.instantiatedCommand(ConfigAsyncJobExecutionActor.scala:191)
at cromwell.backend.standard.StandardAsyncExecutionActor.commandScriptContents(StandardAsyncExecutionActor.scala:235)
at cromwell.backend.standard.StandardAsyncExecutionActor.commandScriptContents$(StandardAsyncExecutionActor.scala:234)
at cromwell.backend.impl.sfs.config.BackgroundConfigAsyncJobExecutionActor.commandScriptContents(ConfigAsyncJobExecutionActor.scala:191)
at cromwell.backend.sfs.SharedFileSystemAsyncJobExecutionActor.writeScriptContents(SharedFileSystemAsyncJobExecutionActor.scala:140)
at cromwell.backend.sfs.SharedFileSystemAsyncJobExecutionActor.writeScriptContents$(SharedFileSystemAsyncJobExecutionActor.scala:139)
at cromwell.backend.impl.sfs.config.BackgroundConfigAsyncJobExecutionActor.cromwell$backend$sfs$BackgroundAsyncJobExecutionActor$$super$writeScriptContents(ConfigAsyncJobExecutionActor.scala:191)
at cromwell.backend.sfs.BackgroundAsyncJobExecutionActor.writeScriptContents(BackgroundAsyncJobExecutionActor.scala:12)
at cromwell.backend.sfs.BackgroundAsyncJobExecutionActor.writeScriptContents$(BackgroundAsyncJobExecutionActor.scala:11)
at cromwell.backend.impl.sfs.config.BackgroundConfigAsyncJobExecutionActor.writeScriptContents(ConfigAsyncJobExecutionActor.scala:191)
at cromwell.backend.sfs.SharedFileSystemAsyncJobExecutionActor.execute(SharedFileSystemAsyncJobExecutionActor.scala:123)
at cromwell.backend.sfs.SharedFileSystemAsyncJobExecutionActor.execute$(SharedFileSystemAsyncJobExecutionActor.scala:121)
at cromwell.backend.impl.sfs.config.BackgroundConfigAsyncJobExecutionActor.execute(ConfigAsyncJobExecutionActor.scala:191)
at cromwell.backend.standard.StandardAsyncExecutionActor.$anonfun$executeAsync$1(StandardAsyncExecutionActor.scala:451)
at scala.util.Try$.apply(Try.scala:209)
at cromwell.backend.standard.StandardAsyncExecutionActor.executeAsync(StandardAsyncExecutionActor.scala:451)
at cromwell.backend.standard.StandardAsyncExecutionActor.executeAsync$(StandardAsyncExecutionActor.scala:451)
at cromwell.backend.impl.sfs.config.BackgroundConfigAsyncJobExecutionActor.executeAsync(ConfigAsyncJobExecutionActor.scala:191)
at cromwell.backend.standard.StandardAsyncExecutionActor.executeOrRecover(StandardAsyncExecutionActor.scala:744)
at cromwell.backend.standard.StandardAsyncExecutionActor.executeOrRecover$(StandardAsyncExecutionActor.scala:736)
at cromwell.backend.impl.sfs.config.BackgroundConfigAsyncJobExecutionActor.executeOrRecover(ConfigAsyncJobExecutionActor.scala:191)
at cromwell.backend.async.AsyncBackendJobExecutionActor.$anonfun$robustExecuteOrRecover$1(AsyncBackendJobExecutionActor.scala:65)
at cromwell.core.retry.Retry$.withRetry(Retry.scala:37)
at cromwell.backend.async.AsyncBackendJobExecutionActor.withRetry(AsyncBackendJobExecutionActor.scala:61)
at cromwell.backend.async.AsyncBackendJobExecutionActor.cromwell$backend$async$AsyncBackendJobExecutionActor$$robustExecuteOrRecover(AsyncBackendJobExecutionActor.scala:65)
at cromwell.backend.async.AsyncBackendJobExecutionActor$$anonfun$receive$1.applyOrElse(AsyncBackendJobExecutionActor.scala:88)
at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:172)
at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:172)
at akka.actor.Actor.aroundReceive(Actor.scala:514)
at akka.actor.Actor.aroundReceive$(Actor.scala:512)
at cromwell.backend.impl.sfs.config.BackgroundConfigAsyncJobExecutionActor.aroundReceive(ConfigAsyncJobExecutionActor.scala:191)
at akka.actor.ActorCell.receiveMessage(ActorCell.scala:527)
at akka.actor.ActorCell.invoke(ActorCell.scala:496)
at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:257)
at akka.dispatch.Mailbox.run(Mailbox.scala:224)
at akka.dispatch.Mailbox.exec(Mailbox.scala:234)
at akka.dispatch.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
at akka.dispatch.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
at akka.dispatch.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
at akka.dispatch.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
Caused by: common.exception.AggregatedMessageException: Error(s):
:
java.lang.IllegalArgumentException: gs://broad-references/hg38/v0/wgs_calling_regions.hg38.interval_list exists on a filesystem not supported by this instance of Cromwell. Supported filesystems are: MacOSXFileSystem. Please refer to the documentation for more information on how to configure filesystems: http://cromwell.readthedocs.io/en/develop/backends/HPC/#filesystems
gs://broad-references/hg38/v0/wgs_calling_regions.hg38.interval_list exists on a filesystem not supported by this instance of Cromwell. Supported filesystems are: MacOSXFileSystem. Please refer to the documentation for more information on how to configure filesystems: http://cromwell.readthedocs.io/en/develop/backends/HPC/#filesystems
at common.validation.Validation$ValidationTry$.toTry$extension1(Validation.scala:60)
at common.validation.Validation$ValidationTry$.toTry$extension0(Validation.scala:56)
at cromwell.backend.standard.StandardAsyncExecutionActor.instantiatedCommand(StandardAsyncExecutionActor.scala:398)
... 42 common frames omitted
[2018-05-31 10:08:45,99] [info] BackgroundConfigAsyncJobExecutionActor [fe6db5b3to_bam_workflow.GetBwaVersion:NA:1]: # not setting set -o pipefail here because /bwa has a rc=1 and we dont want to allow rc=1 to succeed because
# the sed may also fail with that error and that is something we actually want to fail on.
/usr/gitc/bwa 2>&1 | \
grep -e '^Version' | \
sed 's/Version: //'
[2018-05-31 10:08:46,02] [info] BackgroundConfigAsyncJobExecutionActor [fe6db5b3to_bam_workflow.GetBwaVersion:NA:1]: executing: docker run \
--cidfile /Users/moushengxu/softspace/mudroom/gatk/five-dollar-genome-analysis-pipeline/cromwell-executions/germline_single_sample_workflow/6b73056d-8171-4712-a05a-b8dfcdeb36d6/call-to_bam_workflow/to_bam_workflow/fe6db5b3-d91a-40d4-a35b-cf3c937deaaa/call-GetBwaVersion/execution/docker_cid \
--rm -i \
\
--entrypoint /bin/bash \
-v /Users/moushengxu/softspace/mudroom/gatk/five-dollar-genome-analysis-pipeline/cromwell-executions/germline_single_sample_workflow/6b73056d-8171-4712-a05a-b8dfcdeb36d6/call-to_bam_workflow/to_bam_workflow/fe6db5b3-d91a-40d4-a35b-cf3c937deaaa/call-GetBwaVersion:/cromwell-executions/germline_single_sample_workflow/6b73056d-8171-4712-a05a-b8dfcdeb36d6/call-to_bam_workflow/to_bam_workflow/fe6db5b3-d91a-40d4-a35b-cf3c937deaaa/call-GetBwaVersion \
us.gcr.io/broad-gotc-prod/genomes-in-the-cloud@sha256:7bc64948a0a9f50ea55edb8b30c710943e44bd861c46a229feaf121d345e68ed /cromwell-executions/germline_single_sample_workflow/6b73056d-8171-4712-a05a-b8dfcdeb36d6/call-to_bam_workflow/to_bam_workflow/fe6db5b3-d91a-40d4-a35b-cf3c937deaaa/call-GetBwaVersion/execution/script
[2018-05-31 10:08:46,10] [info] fe6db5b3-d91a-40d4-a35b-cf3c937deaaa-SubWorkflowActor-SubWorkflow-to_bam_workflow:-1:1 [fe6db5b3]: Starting to_bam_workflow.CreateSequenceGroupingTSV
[2018-05-31 10:08:47,08] [warn] BackgroundConfigAsyncJobExecutionActor [fe6db5b3to_bam_workflow.CreateSequenceGroupingTSV:NA:1]: Unrecognized runtime attribute keys: preemptible, memory
[2018-05-31 10:08:47,08] [error] BackgroundConfigAsyncJobExecutionActor [fe6db5b3to_bam_workflow.CreateSequenceGroupingTSV:NA:1]: Error attempting to Execute
java.lang.Exception: Failed command instantiation
at cromwell.backend.standard.StandardAsyncExecutionActor.instantiatedCommand(StandardAsyncExecutionActor.scala:400)
at cromwell.backend.standard.StandardAsyncExecutionActor.instantiatedCommand$(StandardAsyncExecutionActor.sca

...


java.lang.IllegalArgumentException: gs://broad-references/hg38/v0/Homo_sapiens_assembly38.dict exists on a filesystem not supported by this instance of Cromwell. Supported filesystems are: MacOSXFileSystem. Please refer to the documentation for more information on how to configure filesystems: http://cromwell.readthedocs.io/en/develop/backends/HPC/#filesystems
gs://broad-references/hg38/v0/Homo_sapiens_assembly38.dict exists on a filesystem not supported by this instance of Cromwell. Supported filesystems are: MacOSXFileSystem. Please refer to the documentation for more information on how to configure filesystems: http://cromwell.readthedocs.io/en/develop/backends/HPC/#filesystems
at common.validation.Validation$ValidationTry$.toTry$extension1(Validation.scala:60)
at common.validation.Validation$ValidationTry$.toTry$extension0(Validation.scala:56)
at cromwell.backend.standard.StandardAsyncExecutionActor.instantiatedCommand(StandardAsyncExecutionActor.scala:398)
... 35 common frames omitted
[2018-05-31 10:12:05,44] [info] Automatic shutdown of the async connection
[2018-05-31 10:12:05,44] [info] Gracefully shutdown sentry threads.
[2018-05-31 10:12:05,44] [info] Starting coordinated shutdown from JVM shutdown hook

...

[2018-05-31 10:12:46,84] [info] WorkflowExecutionActor-6b73056d-8171-4712-a05a-b8dfcdeb36d6 [6b73056d]: WorkflowExecutionActor [6b73056d] aborted: SubWorkflow-to_bam_workflow:-1:1
[2018-05-31 10:12:47,72] [info] WorkflowManagerActor All workflows are aborted
[2018-05-31 10:12:47,72] [info] WorkflowManagerActor All workflows finished
[2018-05-31 10:12:47,72] [info] WorkflowManagerActor stopped
[2018-05-31 10:12:47,72] [info] Connection pools shut down
[2018-05-31 10:12:47,72] [info] Shutting down SubWorkflowStoreActor - Timeout = 1800000 milliseconds
[2018-05-31 10:12:47,72] [info] Shutting down JobStoreActor - Timeout = 1800000 milliseconds
[2018-05-31 10:12:47,72] [info] Shutting down CallCacheWriteActor - Timeout = 1800000 milliseconds
[2018-05-31 10:12:47,72] [info] SubWorkflowStoreActor stopped
[2018-05-31 10:12:47,72] [info] Shutting down ServiceRegistryActor - Timeout = 1800000 milliseconds
[2018-05-31 10:12:47,72] [info] Shutting down DockerHashActor - Timeout = 1800000 milliseconds
[2018-05-31 10:12:47,72] [info] Shutting down IoProxy - Timeout = 1800000 milliseconds
[2018-05-31 10:12:47,72] [info] CallCacheWriteActor Shutting down: 0 queued messages to process
[2018-05-31 10:12:47,72] [info] CallCacheWriteActor stopped
[2018-05-31 10:12:47,72] [info] JobStoreActor stopped
[2018-05-31 10:12:47,72] [info] KvWriteActor Shutting down: 0 queued messages to process
[2018-05-31 10:12:47,72] [info] DockerHashActor stopped
[2018-05-31 10:12:47,72] [info] WriteMetadataActor Shutting down: 37 queued messages to process
[2018-05-31 10:12:47,72] [info] IoProxy stopped
[2018-05-31 10:12:47,73] [info] WriteMetadataActor Shutting down: processing 0 queued messages
[2018-05-31 10:12:47,73] [info] ServiceRegistryActor stopped
[2018-05-31 10:12:47,74] [info] Database closed
[2018-05-31 10:12:47,74] [info] Stream materializer shut down
[2018-05-31 10:12:47,74] [info] Message [cromwell.core.actor.StreamActorHelper$StreamFailed] without sender to Actor[akka://cromwell-system/deadLetters] was not delivered. [3] dead letters encountered. This logging can be turned off or adjusted with configuration settings 'akka.log-dead-letters' and 'akka.log-dead-letters-during-shutdown'.
Modified 17 months ago by vdauwera • written 18 months ago by moxu

What have you tried?

Written 18 months ago by YaGalbi
java -jar cromwell-31.jar run germline_single_sample_workflow.wdl --inputs germline_single_sample_workflow.hg38.inputs.json
Modified 18 months ago by RamRS • written 18 months ago by moxu

YaGalbi meant "What have you tried in order to solve the problem?", not "What command did you use?"

Written 18 months ago by RamRS

gs://broad-references/hg38/v0/wgs_calling_regions.hg38.interval_list exists on a filesystem not supported by this instance of Cromwell. Supported filesystems are: MacOSXFileSystem. Please refer to the documentation for more information on how to configure filesystems: http://cromwell.readthedocs.io/en/develop/backends/HPC/#filesystems

Modified 18 months ago • written 18 months ago by cpad0112

I noticed that. I ran it on my MacBook Pro (macOS 10.13.4), whose filesystem should be the supported "MacOSXFileSystem". Do I have to do anything described at http://cromwell.readthedocs.io/en/develop/backends/HPC/#filesystems?

Written 18 months ago by moxu
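
A note on the exchange above: the error message lists MacOSXFileSystem as the filesystem that is supported, so the local disk does not appear to be the problem. What it rejects are the inputs whose paths start with gs://. The following is a minimal sketch, not anything taken from the pipeline itself, of how one might list every gs:// value in an inputs JSON. The function name find_gcs_paths and the key names in the example are made up for illustration; only the two gs:// paths come from the error messages.

```python
import json


def find_gcs_paths(value, key=""):
    """Return (key, path) pairs for every gs:// string in a parsed inputs JSON."""
    found = []
    if isinstance(value, str):
        if value.startswith("gs://"):
            found.append((key, value))
    elif isinstance(value, list):
        # Arrays of files keep the key of the input they belong to.
        for item in value:
            found.extend(find_gcs_paths(item, key))
    elif isinstance(value, dict):
        for name, item in value.items():
            found.extend(find_gcs_paths(item, name))
    return found


# Stand-in for the real inputs file. The two gs:// paths are the ones named in
# the error messages; the key names and the local path are hypothetical.
example_inputs = json.loads("""
{
  "example_workflow.calling_interval_list": "gs://broad-references/hg38/v0/wgs_calling_regions.hg38.interval_list",
  "example_workflow.ref_dict": "gs://broad-references/hg38/v0/Homo_sapiens_assembly38.dict",
  "example_workflow.local_file": "/Users/someone/data/sample.bam",
  "example_workflow.some_count": 50
}
""")

for key, path in find_gcs_paths(example_inputs):
    print(f"{key}: {path}")
```

To check the real file, the parsed contents of germline_single_sample_workflow.hg38.inputs.json (loaded with json.load) can be passed in place of example_inputs. Every path it prints is one that a local-only Cromwell backend would presumably reject in the same way.
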
17 months ago, vdauwera (Cambridge, MA) wrote:

Hey everyone, this is a bit of a rabbit hole, so let's take a step back. If you're going to run the pipelines on a platform other than Google Cloud, you need to use a (slightly) different pipeline script (for computational efficiency) and manage data access differently. I can follow up here: https://gatkforums.broadinstitute.org/wdl/discussion/12111/can-someone-help-me-run-any-gatk4-pipeline

Written 17 months ago by vdauwera
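
To illustrate the "manage data access differently" part only: one possible approach, offered here as an assumption rather than as the method the answer has in mind, is to copy the referenced files to local disk (for example with gsutil) and point the inputs JSON at the local copies. The sketch below rewrites gs:// values to paths under a local mirror directory. The directory /data/gcs_mirror, the function names, and the example keys are all hypothetical, and the sketch does not address the different pipeline script the answer also calls for.

```python
import json
from pathlib import PurePosixPath


def to_local_path(gcs_uri, local_root):
    """Map gs://bucket/path/file to local_root/bucket/path/file."""
    relative = gcs_uri[len("gs://"):]
    return str(PurePosixPath(local_root) / relative)


def localize_inputs(value, local_root):
    """Return a copy of a parsed inputs JSON with gs:// strings rewritten."""
    if isinstance(value, str):
        if value.startswith("gs://"):
            return to_local_path(value, local_root)
        return value
    if isinstance(value, list):
        return [localize_inputs(item, local_root) for item in value]
    if isinstance(value, dict):
        return {name: localize_inputs(item, local_root) for name, item in value.items()}
    # Numbers, booleans and null pass through unchanged.
    return value


# Hypothetical inputs; only the gs:// path is taken from the error log.
inputs = {
    "example_workflow.ref_dict": "gs://broad-references/hg38/v0/Homo_sapiens_assembly38.dict",
    "example_workflow.some_count": 50,
}

local_inputs = localize_inputs(inputs, "/data/gcs_mirror")
print(json.dumps(local_inputs, indent=2))
```

The rewritten dictionary could then be saved as a new inputs file and passed to Cromwell with --inputs. This assumes the files really exist at the mirrored locations; the script only edits paths and does not download anything.
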