Question: Is there a schema for CWL inputs/job files?
7 months ago, karl.sebby wrote:

One thing I really like about CWL is the ability to load CWL files into a CommandLineTool or Workflow object after generating the classes with schema-salad-tool --codegen=python CommonWorkflowLanguage.yml > cwl_classes.py. Is there a schema file that describes CWL inputs/job files, similar to CommonWorkflowLanguage.yml? I have been working with these files as dicts, but it would be super nice to be able to load them straight into a class object.

7 months ago, Michael R. Crusoe (Common Workflow Language project) wrote:

Hello Karl,

Yes, the inputs section of a CWL document is a schema for the input job object.
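
To make this concrete, here is a minimal sketch in plain Python (not cwltool or schema-salad) that treats a parsed inputs section as a schema and checks a job object against it. The function name check_job and the small type table are illustrative assumptions, and only a few primitive CWL types are covered.

```python
# Illustrative only: a tiny checker, not the validation cwltool performs.
# Maps a few CWL primitive type names to Python types (assumed subset).
PRIMITIVES = {
    "string": str,
    "int": int,
    "long": int,
    "float": float,
    "double": float,
    "boolean": bool,
}


def check_job(inputs, job):
    """Return a list of problems found when checking `job` against `inputs`.

    `inputs` is the map form of a CWL inputs section (name -> {"type": ...}),
    `job` is the parsed job file (name -> value).
    """
    problems = []
    for name, spec in inputs.items():
        type_name = spec["type"]
        optional = type_name.endswith("?")
        if optional:
            type_name = type_name[:-1]
        if name not in job:
            if not optional:
                problems.append(f"missing required input: {name}")
            continue
        expected = PRIMITIVES.get(type_name)
        if expected is None:
            problems.append(f"unsupported type in this sketch: {type_name}")
        elif not isinstance(job[name], expected):
            problems.append(f"{name}: expected {type_name}")
    return problems


# The inputs section of the echo example and its job object, as dicts.
inputs = {"message": {"type": "string", "inputBinding": {"position": 1}}}
job = {"message": "Hello world!"}
print(check_job(inputs, job))  # prints []
```

The inputBinding key is ignored here because it describes how the command line is built, not what the job object must contain.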


Thanks! I remember coming across this at some point now...

written 7 months ago by karl.sebby
7 months ago, peter.amstutz wrote:

To expand a bit on what Michael said, the "inputs" and "outputs" sections of every tool or workflow are schemas for the input and output objects respectively, so (although I have not tried it) it is probably not much more complicated than dumping the inputs section and running the code generator on it.
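
A minimal sketch of the "dumping the inputs section" step, assuming the CWL document has already been parsed into Python dicts and lists (for example with a YAML loader). The helper name inputs_to_fields is hypothetical. It accepts both the map form and the list form of an inputs section and returns a plain name -> type mapping that could then be placed in a record schema for the code generator.

```python
def inputs_to_fields(inputs):
    """Reduce a parsed CWL inputs section to {name: type}.

    Handles the map form ({"message": {"type": "string", ...}}), the
    shorthand map form ({"message": "string"}), and the list form
    ([{"id": "...#message", "type": "string"}]).
    """
    if isinstance(inputs, dict):
        items = inputs.items()
    else:
        # List form: use the last path segment of the fragment after '#'.
        items = (
            (entry["id"].split("#")[-1].split("/")[-1], entry)
            for entry in inputs
        )

    fields = {}
    for name, spec in items:
        # A dict carries the type under "type"; anything else (a string or
        # a list of alternatives) is the type itself. Tool-only keys such
        # as inputBinding are dropped.
        fields[name] = spec["type"] if isinstance(spec, dict) else spec
    return fields


map_form = {"message": {"type": "string", "inputBinding": {"position": 1}}}
list_form = [
    {
        "id": "file:///sandbox/echo.cwl#message",
        "inputBinding": {"position": 1},
        "type": "string",
    }
]
print(inputs_to_fields(map_form))   # {'message': 'string'}
print(inputs_to_fields(list_form))  # {'message': 'string'}
```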


Thanks. Will give it a try!

written 7 months ago by karl.sebby

OK. I've gotten around to giving this a try and I'm hitting some issues. To keep things simple I'm using the echo example, 1st-tool.cwl and echo-job.yml, from the user guide: https://github.com/common-workflow-language/user_guide/tree/gh-pages/_includes/cwl/02-1st-example. I've tried to validate the inputs section with schema-salad-tool inputs.yml, where inputs.yml contains the inputs section just as it is written in the tool:

message:
  type: string
  inputBinding:
    position: 1

or after it has been loaded and then dumped/saved using the generated python classes:

- id: file:///sandbox/echo.cwl#message
  inputBinding:
    position: 1
  type: string

Both forms fail in the same way: neither is a valid SaladRecordSchema, SaladEnumSchema, or Documentation field. That led me to create the following, which does validate:

- name: Inputs
  documentRoot: true
  type: record
  fields:
    inputs:
      type:
        type: array
        items: Input


- name: Input
  type: record
  fields:
    message:
      type: string

Does it seem like I'm going down the right path, or am I making things more complicated than they need to be?

written 5 months ago by karl.sebby

Hey Karl.

Yep, you are very close to a valid schema salad representation of the inputs section of 1st-tool.cwl. You'll need to add - $import: "schema_salad/metaschema/metaschema_base.yml" to the beginning of the document and leave out the Inputs section; then the following will work:

$ schema-salad-tool biostars-383396.yml echo-job.yml
/home/michael/schema_salad/env3.7/bin/schema-salad-tool Current version: 4.5.20190815125611
Document `echo-job.yml` is valid
$ schema-salad-tool --codegen python biostars-383396.yml > cwl_utils/first_tool.py
/home/michael/schema_salad/env3.7/bin/schema-salad-tool Current version: 4.5.20190815125611
$ python -c "from cwl_utils.first_tool import load_document; print(load_document('echo-job.yml').message)"
INFO:rdflib:RDFLib Version: 4.2.2
Hello world!

Now we need to find an automated method of doing that. I've opened a feature request for it at https://github.com/common-workflow-language/schema_salad/issues/276 (maybe it belongs in cwltool, we'll see).
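
One possible shape for such an automated step, sketched in plain Python. This is not the feature requested in the issue and not part of schema_salad or cwltool; build_job_schema and the default record name Input are assumptions for illustration. It builds the kind of document described above (the $import entry followed by a single record whose fields come from the tool's inputs) and serializes it as JSON, which YAML parsers can generally read as well.

```python
import json

# Import path quoted from the reply above; adjust it to wherever the
# schema_salad sources live on your system.
METASCHEMA_IMPORT = "schema_salad/metaschema/metaschema_base.yml"


def build_job_schema(fields, record_name="Input"):
    """Build a schema document for a job file from a {name: type} mapping."""
    record = {
        "name": record_name,
        "type": "record",
        "fields": {name: {"type": type_} for name, type_ in fields.items()},
    }
    return [{"$import": METASCHEMA_IMPORT}, record]


schema = build_job_schema({"message": "string"})
# JSON is used here so that only the standard library is needed.
print(json.dumps(schema, indent=2))
```

Whether the result validates with schema-salad-tool still has to be checked case by case, in particular for File and Directory inputs.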

modified 5 months ago • written 5 months ago by Michael R. Crusoe

Awesome! I'll give this a try and then try out some File and Directory inputs.

written 5 months ago by karl.sebby
4 months ago, karl.sebby wrote:

Here's what I ended up with; it has been working for the cases I've tested so far.

$base: "https://w3id.org/cwl/cwl#"

$namespaces:
  cwl: "https://w3id.org/cwl/cwl#"
  sld: "https://w3id.org/cwl/salad#"
  rdfs: "http://www.w3.org/2000/01/rdf-schema#"

$graph:

# items from Process.yml

- $import: metaschema_base.yml

- name: CWLType
  type: enum
  extends: "sld:PrimitiveType"
  symbols:
    - cwl:File
    - cwl:Directory

- name: File
  type: record
  docParent: "#CWLType"
  doc:
  fields:
    - name: class
      type:
        type: enum
        name: File_class
        symbols:
          - cwl:File
      jsonldPredicate:
        _id: "@type"
        _type: "@vocab"

    - name: location
      type: string?
      jsonldPredicate:
        _id: "@id"
        _type: "@id"

    - name: path
      type: string?
      jsonldPredicate:
        "_id": "cwl:path"
        "_type": "@id"

    - name: basename
      type: string?
      jsonldPredicate: "cwl:basename"

    - name: dirname
      type: string?

    - name: nameroot
      type: string?

    - name: nameext
      type: string?

    - name: checksum
      type: string?

    - name: size
      type: long?

    - name: "secondaryFiles"
      type:
        - "null"
        - type: array
          items: [File, Directory]
      jsonldPredicate: "cwl:secondaryFiles"

    - name: format
      type: string?
      jsonldPredicate:
        _id: cwl:format
        _type: "@id"
        identity: true

    - name: contents
      type: string?

- name: Directory
  type: record
  fields:
    - name: class
      type:
        type: enum
        name: Directory_class
        symbols:
          - cwl:Directory
      jsonldPredicate:
        _id: "@type"
        _type: "@vocab"

    - name: location
      type: string?
      jsonldPredicate:
        _id: "@id"
        _type: "@id"

    - name: path
      type: string?
      jsonldPredicate:
        _id: "cwl:path"
        _type: "@id"

    - name: basename
      type: string?
      jsonldPredicate: "cwl:basename"

    - name: listing
      type:
        - "null"
        - type: array
          items: [File, Directory]
      jsonldPredicate:
        _id: "cwl:listing"



- name: InputsField
  type: record
  documentRoot: true
  fields: ~

I then populate InputsField.fields with a map of name: type taken from the CWL file. For example, for a single optional input called inFiles that expects an array of Files:

- name: InputsField
  type: record
  documentRoot: true
  fields:
    inFiles:
      type:
        - "null"
        - type: array
          items: File
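
The population step could be sketched as follows. This is an assumed reconstruction in plain Python, not the code actually used: expand_type and populate are hypothetical helpers, and the template is taken to be the already-parsed $graph list. expand_type rewrites the CWL shorthand suffixes ? (optional) and [] (array) into the explicit form shown in the example.

```python
import copy


def expand_type(type_):
    """Expand CWL shorthand such as 'File[]?' into the explicit form."""
    if not isinstance(type_, str):
        return type_  # already explicit (a list or a dict)
    if type_.endswith("?"):
        return ["null", expand_type(type_[:-1])]
    if type_.endswith("[]"):
        return {"type": "array", "items": expand_type(type_[:-2])}
    return type_


def populate(graph, fields):
    """Return a copy of `graph` with InputsField.fields filled in."""
    graph = copy.deepcopy(graph)  # leave the template untouched
    for entry in graph:
        if entry.get("name") == "InputsField":
            entry["fields"] = {
                name: {"type": expand_type(type_)}
                for name, type_ in fields.items()
            }
    return graph


# Only the InputsField entry of the template is reproduced here.
template = [
    {"name": "InputsField", "type": "record", "documentRoot": True, "fields": None}
]
filled = populate(template, {"inFiles": "File[]?"})
print(filled[0]["fields"])
# {'inFiles': {'type': ['null', {'type': 'array', 'items': 'File'}]}}
```

Working on a deep copy keeps the template reusable across several CWL files.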