How do you get input files to a Cromwell server?
19 months ago
Irsan ★ 7.4k

I use Cromwell in server mode to run bioinformatics data-processing pipelines written in WDL ("widdle", the Workflow Description Language, .wdl files). The machine from which I submit workflows is not the machine that hosts the Cromwell server, so the server is remote. The input files for the workflow I want to submit are stored on the submitting machine, but they need to be available on the Cromwell server machine. Should I first copy my files over and then submit the workflow? Or is there a way to submit the workflow and let Cromwell copy the input files over automatically?

cromwell wdl widdle workflow • 853 views
19 months ago
vdauwera ★ 1.1k

Hi @Irsan, I was notified of your question through the Broad’s email system. I'm posting the answer here so it can benefit others.

By default, Cromwell will copy the files from their storage location to the machine where execution happens. The Cromwell server just needs to be able to recognize and access the file system you use for storage. However, it’s possible to change this behavior to use soft links instead, if the machine where execution happens can access the relevant file system directly.
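
As a rough illustration of the soft-link option, the fragment below sketches how the localization strategy might be set in the Cromwell configuration file. It assumes the `Local` shared-filesystem backend; the provider name and the exact nesting are assumptions, so check them against the example configuration shipped with your Cromwell version. Cromwell tries the listed strategies in order and falls back to the next one if a strategy fails.

```
# Sketch only: assumes a shared-filesystem backend named "Local".
backend {
  providers {
    Local {
      config {
        filesystems {
          local {
            # Try a soft link first; fall back to copying the file
            # if the link cannot be created.
            localization: [
              "soft-link", "copy"
            ]
          }
        }
      }
    }
  }
}
```

Soft links are likely to help only when the task does not run inside a Docker container, since the link target may not be visible from inside the container.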

Note that the Cromwell team currently doesn’t have the bandwidth to provide direct support, so we recently decided to start encouraging Cromwell users to post their questions on bioinformatics.stackexchange.com. Since a growing number of people are using Cromwell worldwide, we’re hoping to foster a peer support community where people can help each other. We chose stackexchange because that’s the place the OpenWDL community has decided to use for help with WDL. There’s not a 1:1 relation between WDL and Cromwell, but since there’s a lot of overlap, it makes sense to have their support in the same place.

Hi Geraldine (I think??), how do I make the Cromwell server machine recognize the client/submission machine's file system? For example, do you recommend making my data directory on the client shareable over the network (SMB/CIFS), or can I adapt the URI string in my input file to something that will tell the Cromwell server to copy via ssh (e.g. ssh://client.machine.name/full/path/to/file.fastq)? Do I have to change anything in the Cromwell configuration (there is a filesystem block)?

BTW, since the bioinformatics user base on biostars is 10 times larger than on bioinformatics.stackexchange.com, I would recommend moving here :-)
