Question: Input specific output filenames
12 months ago, nmaki wrote:

I'm working on an implementation of a fastqc wrapper written in CWL, and am wondering if there is a better way to ensure that the log outputs of my script are unique (instead of having to name them manually).

outputs:
  output_qc_report_file:
    type: File
    outputBinding:
      glob: "*_fastqc.zip"
  console_log:
    type: stdout
  error_log:
    type: stderr

Instead of fastqc_con.txt for the console log, the name would be something like "sample_id-tool_ran.txt" or "22_TO_04_S38_R2_001-fastqc_con.txt".


I don't know the exact capabilities of CWL, but adding .log to the name of your sample is one common solution. Adding a timestamp would be another.

written 12 months ago by Mensur Dlakic
12 months ago, Tom wrote:

See https://www.commonwl.org/user_guide/05-stdout/index.html

You can choose any name you want for the file created from stdout. Using a JavaScript expression, you can easily derive the name from other parameters. For your case (using an inputs section I made up), this might look like:

inputs:
  reads:
    type: File
    inputBinding:
      position: 2
outputs:
  output_qc_report_file:
    type: File
    outputBinding:
      glob: "*_fastqc.zip"
  console_log:
    type: stdout
stdout: $((inputs.reads.nameroot)+"_console_log.txt")

In case it is useful to you: here is a fastqc tool wrapper I use. However, it does not capture stdout.


A simpler variation of the stdout line, using string interpolation rather than a JavaScript expression:

stdout: $(inputs.reads.nameroot)_console_log.txt
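
For context, here is a minimal sketch of a complete tool that uses this interpolated form for both logs. The input name reads follows Tom's made-up inputs section; the baseCommand, the --outdir argument, and the _error_log.txt suffix are assumptions for illustration and are not taken from the original wrapper. Since $(inputs.reads.nameroot) is a plain parameter reference, this form should not need InlineJavascriptRequirement.

```yaml
cwlVersion: v1.0
class: CommandLineTool
baseCommand: fastqc
# Assumption: write the reports into the working directory so the glob finds them.
arguments: ["--outdir", "."]
inputs:
  reads:
    type: File
    inputBinding:
      position: 2
outputs:
  output_qc_report_file:
    type: File
    outputBinding:
      glob: "*_fastqc.zip"
  console_log:
    type: stdout
  error_log:
    type: stderr
# Both log names are derived from the input file name without its last extension.
stdout: $(inputs.reads.nameroot)_console_log.txt
stderr: $(inputs.reads.nameroot)_error_log.txt
```

Note that nameroot strips only the final extension, so an input named sample.fastq.gz would give sample.fastq_console_log.txt.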
written 12 months ago by Michael R. Crusoe

Thank you for the response! It has helped a bit: I now get a complete run of my script with fastqc and log output. I ended up using something similar to the solution you posted:

stdout: $((inputs.seqfile.basename)+"_console_log.txt")

However, the logs themselves are named undefined_console_log.txt and undefined_error_log.txt respectively.

written 12 months ago by nmaki

Please post the complete tool; I'm sure we can solve this.

written 12 months ago by Tom

Since the code is too long to write out, and copying and pasting gives me formatting errors, I am going to post screenshots of it:

https://photos.app.goo.gl/5PERRYeA9m11qT126

Thank you for your help, I really appreciate it!

written 12 months ago by nmaki

Hi again!

inputs.seqfile is an array. While the elements in the array probably have a basename, the array itself does not. Would taking the name of the first file in the array be sufficient for your purposes?

inputs.seqfile[0].basename

Also note that basename includes the file extension, so the output would be something like "mysequence.fastq_console_log.txt". You can use nameroot to get the name without the extension.
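
Put together, the relevant parts of the tool might look like the sketch below. The input name seqfile comes from this thread; the File[] type is inferred from the input being an array, and the rest of the tool (posted only as screenshots) is left out. It assumes the array is never empty and names the logs after the first file only.

```yaml
inputs:
  seqfile:
    type: File[]   # assumption: an array of sequence files
    inputBinding:
      position: 2
outputs:
  console_log:
    type: stdout
  error_log:
    type: stderr
# Index into the array first, then take nameroot (the file name without its extension).
stdout: $(inputs.seqfile[0].nameroot)_console_log.txt
stderr: $(inputs.seqfile[0].nameroot)_error_log.txt
```

If the tool is run once per file (for example, scattered in a workflow), a single File input avoids the indexing altogether.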

Cheers, Tom

written 12 months ago by Tom

Thank you so much, that works perfectly! I'll be integrating this into all of my workflows right away.

written 12 months ago by nmaki