4.3 years ago
moxu ▴ 500

GATK4 is a great variant-calling package and dominates the field. Unfortunately, I have not been able to run any of the GATK4 pipelines, probably because they all use Google Storage, which I am not familiar with. I am now trying to run the $5 variant calling pipeline because I guess it is the easiest to run.

I downloaded the pipeline using:
git clone https://github.com/gatk-workflows/five-dollar-genome-analysis-pipeline.git

The command line used was:
java -jar cromwell-31.jar run germline_single_sample_workflow.wdl --inputs germline_single_sample_workflow.hg38.inputs.json

Both the .wdl file and the .json file are included in the GitHub package and were unchanged when I ran the command above.
at cromwell.backend.impl.sfs.config.BackgroundConfigAsyncJobExecutionActor.instantiatedCommand$lzycompute(ConfigAsyncJobExecutionActor.scala:191)
at cromwell.backend.impl.sfs.config.BackgroundConfigAsyncJobExecutionActor.instantiatedCommand(ConfigAsyncJobExecutionActor.scala:191)
at cromwell.backend.standard.StandardAsyncExecutionActor.commandScriptContents(StandardAsyncExecutionActor.scala:235)
at cromwell.backend.standard.StandardAsyncExecutionActor.commandScriptContents$(StandardAsyncExecutionActor.scala:234)
at cromwell.backend.impl.sfs.config.BackgroundConfigAsyncJobExecutionActor.commandScriptContents(ConfigAsyncJobExecutionActor.scala:191)
at cromwell.backend.sfs.SharedFileSystemAsyncJobExecutionActor.writeScriptContents(SharedFileSystemAsyncJobExecutionActor.scala:140)
at cromwell.backend.sfs.SharedFileSystemAsyncJobExecutionActor.writeScriptContents$(SharedFileSystemAsyncJobExecutionActor.scala:139)
at cromwell.backend.impl.sfs.config.BackgroundConfigAsyncJobExecutionActor.cromwell$backend$sfs$BackgroundAsyncJobExecutionActor$$superwriteScriptContents(ConfigAsyncJobExecutionActor.scala:191)
at cromwell.backend.sfs.BackgroundAsyncJobExecutionActor.writeScriptContents(BackgroundAsyncJobExecutionActor.scala:12)
at cromwell.backend.sfs.BackgroundAsyncJobExecutionActor.writeScriptContents(BackgroundAsyncJobExecutionActor.scala:11)
at cromwell.backend.impl.sfs.config.BackgroundConfigAsyncJobExecutionActor.writeScriptContents(ConfigAsyncJobExecutionActor.scala:191)
at cromwell.backend.sfs.SharedFileSystemAsyncJobExecutionActor.execute(SharedFileSystemAsyncJobExecutionActor.scala:123)
at cromwell.backend.sfs.SharedFileSystemAsyncJobExecutionActor.execute(SharedFileSystemAsyncJobExecutionActor.scala:121)
at cromwell.backend.impl.sfs.config.BackgroundConfigAsyncJobExecutionActor.execute(ConfigAsyncJobExecutionActor.scala:191)
at cromwell.backend.standard.StandardAsyncExecutionActor.anonfunexecuteAsync1(StandardAsyncExecutionActor.scala:451)
at scala.util.Try.apply(Try.scala:209)
at cromwell.backend.standard.StandardAsyncExecutionActor.executeAsync(StandardAsyncExecutionActor.scala:451)
at cromwell.backend.standard.StandardAsyncExecutionActor.executeAsync(StandardAsyncExecutionActor.scala:451)
at cromwell.backend.impl.sfs.config.BackgroundConfigAsyncJobExecutionActor.executeAsync(ConfigAsyncJobExecutionActor.scala:191)
at cromwell.backend.standard.StandardAsyncExecutionActor.executeOrRecover(StandardAsyncExecutionActor.scala:744)
at cromwell.backend.standard.StandardAsyncExecutionActor.executeOrRecover(StandardAsyncExecutionActor.scala:736)
at cromwell.backend.impl.sfs.config.BackgroundConfigAsyncJobExecutionActor.executeOrRecover(ConfigAsyncJobExecutionActor.scala:191)
at cromwell.backend.async.AsyncBackendJobExecutionActor.anonfunrobustExecuteOrRecover1(AsyncBackendJobExecutionActor.scala:65)
at cromwell.core.retry.Retry.withRetry(Retry.scala:37)
at cromwell.backend.async.AsyncBackendJobExecutionActor.withRetry(AsyncBackendJobExecutionActor.scala:61)
at cromwell.backend.async.AsyncBackendJobExecutionActor.cromwellbackendasyncAsyncBackendJobExecutionActor$$robustExecuteOrRecover(AsyncBackendJobExecutionActor.scala:65)
at cromwell.backend.async.AsyncBackendJobExecutionActoranonfun$receive$1.applyOrElse(AsyncBackendJobExecutionActor.scala:88)
at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:172)
at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:172)
at akka.actor.Actor.aroundReceive$(Actor.scala:512)
at cromwell.backend.impl.sfs.config.BackgroundConfigAsyncJobExecutionActor.aroundReceive(ConfigAsyncJobExecutionActor.scala:191)
at akka.actor.ActorCell.receiveMessage(ActorCell.scala:527)
at akka.actor.ActorCell.invoke(ActorCell.scala:496)
at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:257)
at akka.dispatch.Mailbox.run(Mailbox.scala:224)
at akka.dispatch.Mailbox.exec(Mailbox.scala:234)
at akka.dispatch.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
at akka.dispatch.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
at akka.dispatch.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
Caused by: common.exception.AggregatedMessageException: Error(s):
:
java.lang.IllegalArgumentException: gs://broad-references/hg38/v0/wgs_calling_regions.hg38.interval_list exists on a filesystem not supported by this instance of Cromwell. Supported filesystems are: MacOSXFileSystem. Please refer to the documentation for more information on how to configure filesystems: http://cromwell.readthedocs.io/en/develop/backends/HPC/#filesystems
gs://broad-references/hg38/v0/wgs_calling_regions.hg38.interval_list exists on a filesystem not supported by this instance of Cromwell. Supported filesystems are: MacOSXFileSystem. Please refer to the documentation for more information on how to configure filesystems: http://cromwell.readthedocs.io/en/develop/backends/HPC/#filesystems
at common.validation.Validation$ValidationTry$.toTry$extension1(Validation.scala:60)
at common.validation.Validation$ValidationTry$.toTry$extension0(Validation.scala:56)
at cromwell.backend.standard.StandardAsyncExecutionActor.instantiatedCommand(StandardAsyncExecutionActor.scala:398)
... 42 common frames omitted
[2018-05-31 10:08:45,99] [info] BackgroundConfigAsyncJobExecutionActor [fe6db5b3to_bam_workflow.GetBwaVersion:NA:1]: # not setting set -o pipefail here because /bwa has a rc=1 and we dont want to allow rc=1 to succeed because
the sed may also fail with that error and that is something we actually want to fail on.
/usr/gitc/bwa 2>&1 | \
grep -e '^Version' | \
sed 's/Version: //'
[2018-05-31 10:08:46,02] [info] BackgroundConfigAsyncJobExecutionActor [fe6db5b3to_bam_workflow.GetBwaVersion:NA:1]: executing: docker run \
--cidfile /Users/moushengxu/softspace/mudroom/gatk/five-dollar-genome-analysis-pipeline/cromwell-executions/germline_single_sample_workflow/6b73056d-8171-4712-a05a-b8dfcdeb36d6/call-to_bam_workflow/to_bam_workflow/fe6db5b3-d91a-40d4-a35b-cf3c937deaaa/call-GetBwaVersion/execution/docker_cid \
--rm -i \
\
--entrypoint /bin/bash \
-v /Users/moushengxu/softspace/mudroom/gatk/five-dollar-genome-analysis-pipeline/cromwell-executions/germline_single_sample_workflow/6b73056d-8171-4712-a05a-b8dfcdeb36d6/call-to_bam_workflow/to_bam_workflow/fe6db5b3-d91a-40d4-a35b-cf3c937deaaa/call-GetBwaVersion:/cromwell-executions/germline_single_sample_workflow/6b73056d-8171-4712-a05a-b8dfcdeb36d6/call-to_bam_workflow/to_bam_workflow/fe6db5b3-d91a-40d4-a35b-cf3c937deaaa/call-GetBwaVersion \
[2018-05-31 10:08:46,10] [info] fe6db5b3-d91a-40d4-a35b-cf3c937deaaa-SubWorkflowActor-SubWorkflow-to_bam_workflow:-1:1 [fe6db5b3]: Starting to_bam_workflow.CreateSequenceGroupingTSV
[2018-05-31 10:08:47,08] [warn] BackgroundConfigAsyncJobExecutionActor [fe6db5b3to_bam_workflow.CreateSequenceGroupingTSV:NA:1]: Unrecognized runtime attribute keys: preemptible, memory
[2018-05-31 10:08:47,08] [error] BackgroundConfigAsyncJobExecutionActor [fe6db5b3to_bam_workflow.CreateSequenceGroupingTSV:NA:1]: Error attempting to Execute
java.lang.Exception: Failed command instantiation
at cromwell.backend.standard.StandardAsyncExecutionActor.instantiatedCommand(StandardAsyncExecutionActor.scala:400)
at cromwell.backend.standard.StandardAsyncExecutionActor.instantiatedCommand$(StandardAsyncExecutionActor.sca ...
java.lang.IllegalArgumentException: gs://broad-references/hg38/v0/Homo_sapiens_assembly38.dict exists on a filesystem not supported by this instance of Cromwell. Supported filesystems are: MacOSXFileSystem. Please refer to the documentation for more information on how to configure filesystems: http://cromwell.readthedocs.io/en/develop/backends/HPC/#filesystems
gs://broad-references/hg38/v0/Homo_sapiens_assembly38.dict exists on a filesystem not supported by this instance of Cromwell. Supported filesystems are: MacOSXFileSystem. Please refer to the documentation for more information on how to configure filesystems: http://cromwell.readthedocs.io/en/develop/backends/HPC/#filesystems
at common.validation.Validation$ValidationTry$.toTry$extension1(Validation.scala:60)
at common.validation.Validation$ValidationTry$.toTry$extension0(Validation.scala:56)
at cromwell.backend.standard.StandardAsyncExecutionActor.instantiatedCommand(StandardAsyncExecutionActor.scala:398)
... 35 common frames omitted
[2018-05-31 10:12:05,44] [info] Automatic shutdown of the async connection
[2018-05-31 10:12:05,44] [info] Gracefully shutdown sentry threads.
[2018-05-31 10:12:05,44] [info] Starting coordinated shutdown from JVM shutdown hook
...
[2018-05-31 10:12:46,84] [info] WorkflowExecutionActor-6b73056d-8171-4712-a05a-b8dfcdeb36d6 [6b73056d]: WorkflowExecutionActor [6b73056d] aborted: SubWorkflow-to_bam_workflow:-1:1
[2018-05-31 10:12:47,72] [info] WorkflowManagerActor All workflows are aborted
[2018-05-31 10:12:47,72] [info] WorkflowManagerActor All workflows finished
[2018-05-31 10:12:47,72] [info] WorkflowManagerActor stopped
[2018-05-31 10:12:47,72] [info] Connection pools shut down
[2018-05-31 10:12:47,72] [info] Shutting down SubWorkflowStoreActor - Timeout = 1800000 milliseconds
[2018-05-31 10:12:47,72] [info] Shutting down JobStoreActor - Timeout = 1800000 milliseconds
[2018-05-31 10:12:47,72] [info] Shutting down CallCacheWriteActor - Timeout = 1800000 milliseconds
[2018-05-31 10:12:47,72] [info] SubWorkflowStoreActor stopped
[2018-05-31 10:12:47,72] [info] Shutting down ServiceRegistryActor - Timeout = 1800000 milliseconds
[2018-05-31 10:12:47,72] [info] Shutting down DockerHashActor - Timeout = 1800000 milliseconds
[2018-05-31 10:12:47,72] [info] Shutting down IoProxy - Timeout = 1800000 milliseconds
[2018-05-31 10:12:47,72] [info] CallCacheWriteActor Shutting down: 0 queued messages to process
[2018-05-31 10:12:47,72] [info] CallCacheWriteActor stopped
[2018-05-31 10:12:47,72] [info] JobStoreActor stopped
[2018-05-31 10:12:47,72] [info] KvWriteActor Shutting down: 0 queued messages to process
[2018-05-31 10:12:47,72] [info] DockerHashActor stopped
[2018-05-31 10:12:47,72] [info] WriteMetadataActor Shutting down: 37 queued messages to process
[2018-05-31 10:12:47,72] [info] IoProxy stopped
[2018-05-31 10:12:47,73] [info] WriteMetadataActor Shutting down: processing 0 queued messages
[2018-05-31 10:12:47,73] [info] ServiceRegistryActor stopped
[2018-05-31 10:12:47,74] [info] Database closed
[2018-05-31 10:12:47,74] [info] Stream materializer shut down
[2018-05-31 10:12:47,74] [info] Message [cromwell.core.actor.StreamActorHelper$StreamFailed] without sender to Actor[akka://cromwell-system/deadLetters] was not delivered. [3] dead letters encountered. This logging can be turned off or adjusted with configuration settings 'akka.log-dead-letters' and 'akka.log-dead-letters-during-shutdown'.

What have you tried?

java -jar cromwell-31.jar run germline_single_sample_workflow.wdl --inputs germline_single_sample_workflow.hg38.inputs.json

YaGalbi meant "What have you tried to solve the problem?", not "What was the command you used?"

gs://broad-references/hg38/v0/wgs_calling_regions.hg38.interval_list exists on a filesystem not supported by this instance of Cromwell. Supported filesystems are: MacOSXFileSystem. Please refer to the documentation for more information on how to configure filesystems: http://cromwell.readthedocs.io/en/develop/backends/HPC/#filesystems

I noticed that. I ran it on my MacBook Pro (macOS 10.13.4), which should be the supported filesystem "MacOSXFileSystem". Do I have to do anything as described at http://cromwell.readthedocs.io/en/develop/backends/HPC/#filesystems?

4.3 years ago
vdauwera ★ 1.2k

Hey everyone, this is a bit of a rabbit hole; let's take a step back. If you're going to be running the pipelines on a platform other than Google Cloud, you need to use a (slightly) different pipeline script (for computational efficiency) and manage data access differently. I can follow up here: https://gatkforums.broadinstitute.org/wdl/discussion/12111/can-someone-help-me-run-any-gatk4-pipeline
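
To make the "manage data access differently" part concrete: the errors in the log are about the gs:// paths in the inputs JSON, not about the Mac itself. This Cromwell instance only reads the local filesystem, so any input that still points to Google Cloud Storage fails. Below is a minimal Python sketch, not an official GATK or Cromwell tool, that lists every gs:// value in a parsed inputs file and rewrites it to a local path. It assumes the referenced files have already been downloaded separately into a local directory that mirrors the bucket layout. The function names, the input keys, and the /data/refs directory are all hypothetical.

```python
import json

GS_PREFIX = "gs://"


def find_gs_uris(value):
    """Recursively collect every gs:// string found in a parsed JSON value."""
    if isinstance(value, str):
        return [value] if value.startswith(GS_PREFIX) else []
    if isinstance(value, list):
        return [uri for item in value for uri in find_gs_uris(item)]
    if isinstance(value, dict):
        return [uri for item in value.values() for uri in find_gs_uris(item)]
    return []


def localize(value, local_root):
    """Return a copy of value with gs://bucket/path rewritten to local_root/bucket/path."""
    if isinstance(value, str) and value.startswith(GS_PREFIX):
        return local_root.rstrip("/") + "/" + value[len(GS_PREFIX):]
    if isinstance(value, list):
        return [localize(item, local_root) for item in value]
    if isinstance(value, dict):
        return {key: localize(item, local_root) for key, item in value.items()}
    return value


# Hypothetical inputs: the keys are made up, the two gs:// paths are the ones
# named in the error messages above.
example_inputs = {
    "hypothetical_wf.ref_dict": "gs://broad-references/hg38/v0/Homo_sapiens_assembly38.dict",
    "hypothetical_wf.calling_intervals": [
        "gs://broad-references/hg38/v0/wgs_calling_regions.hg38.interval_list"
    ],
    "hypothetical_wf.threads": 4,
}

# Paths that would have to be downloaded before a local run.
for uri in find_gs_uris(example_inputs):
    print(uri)

# Inputs rewritten to point at the (assumed) local mirror.
print(json.dumps(localize(example_inputs, "/data/refs"), indent=2))
```

For a real run you would load the inputs file with json.load, apply localize, and write the result to a new JSON file to pass to --inputs. Whether the rest of the workflow then runs locally is a separate question, since the answer above says a slightly different pipeline script is needed outside Google Cloud.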

Was this ever followed up? I can't find the link, but I was hoping to understand how to pass a local directory to runtime {docker:, disks: } while running a WDL pipeline on macOS. I provided an absolute path to "disks", but I get "Unrecognized runtime attribute keys: disks". I'm not sure if I should open another question or if it's okay to continue here.