Question: CWL docker command user
Asked 6 months ago by dganiewich (Buenos Aires, Argentina):

Hi! I am using CWL to run a program. When I run cwl-runner tool.cwl tool_params.yml, the generated docker command includes:

--user=500:500

When I try running my tool, there is an id error that doesn't allow me to run it correctly (see https://gatkforums.broadinstitute.org/gatk/discussion/comment/59004#Comment_59004).

How can I make it use the correct uid? I checked my /etc/passwd file, and the user is indeed 500:500. Any ideas what may be going on?

Thank you,

Best,

Daiana

Tags: docker, cwl, cwl-runner
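For anyone wondering where the --user=500:500 value comes from: as far as I know, the runner takes it from the numeric uid and gid of the host user who starts it, so it should match /etc/passwd on the host by construction. A minimal sketch to reproduce the flag on the host (the function name host_user_flag is made up for illustration):

```shell
# Sketch: rebuild the --user flag from the invoking user's numeric ids.
# Assumption: the runner derives the flag from the host uid and gid.
host_user_flag() {
  printf '%s\n' "--user=$(id -u):$(id -g)"
}

host_user_flag
```

If this prints the same value as in the docker command, the flag itself is probably not wrong; what matters is whether that uid means anything inside the container.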

I'm not sure this problem is caused by missing permissions, but then again, I have no experience using GATK. Would you mind sharing the Docker container and the command line tool?

(comment by Tom, 6 months ago)

Hi Tom! Sorry for the belated response. Here is the command line tool where you can see the docker container:

#!/usr/bin/env cwl-runner

cwlVersion: v1.0
class: CommandLineTool

baseCommand: 
  - "gatk"
  - "MarkDuplicatesSpark" 

hints:
  DockerRequirement:
    dockerPull: broadinstitute/gatk

requirements:
- class: InlineJavascriptRequirement

inputs:
  inputFileName_markDups:
    type: File
    inputBinding:
      position: 4
      prefix: -I
    doc: One or more input SAM or BAM files to analyze. Must be coordinate sorted.
      Default value null. This option may be specified 0 or more times

  validationStringency:
    type: string
    default: LENIENT
    inputBinding:
      position: 23
      prefix: -VS
    doc: Validation stringency for all SAM/BAM/CRAM/SRA files read by this program. The default stringency value SILENT can improve performance when processing a BAM file in which variable-length data (read, qualities, tags) do not otherwise need to be decoded.

  metricsFile:
    type: string
    default: "metrics.txt"
    inputBinding:
      position: 6
      prefix: -M
    doc: File to write duplication metrics to. Required.

  createIndex:
    type: string?
    default: 'true'
    inputBinding:
      position: 20
      prefix: -OBI
    doc: Whether to create a BAM index when writing a coordinate-sorted BAM file.
      Default value false. This option can be set to 'null' to clear the default value.
      Possible values {true, false}
outputs:
  markDups_output:
    type: File
    outputBinding:
      glob: output.dedup.bam
    secondaryFiles:
      - .bai

arguments:
- position: 10
  prefix: '-O'
  valueFrom: output.dedup.bam

Thanks, Daiana

(comment by dganiewich, 6 months ago)

Hi Daiana! Did the answer below help you? Please feel free to ask if you have further questions.

Coming from a biology background, it takes me a lot of time (and questions in this forum) to figure some of the bioinformatics stuff out, so I'm very sympathetic to anyone needing more elaborate explanations.

Regards, Tom

(comment by Tom, 6 months ago)

Hi Tom, thank you very much for your help! Indeed, I have struggled a lot with bioinformatics! I'm sorry for my belated response, but I have not yet been able to make it work. Maybe there is something wrong with how I import the Dockerfile? Just to be sure, my new code has:

#!/usr/bin/env cwl-runner

cwlVersion: v1.0
class: CommandLineTool

baseCommand: 
  - "gatk"
  - "MarkDuplicatesSpark" 

requirements:
  - class: InlineJavascriptRequirement

hints:  
  - $import: gatk-docker.yml

etc...

And then my gatk-docker.yml is:

class: DockerRequirement
dockerPull: broadinstitute/gatk:4.1.2.0
dockerFile: >
  $import: gatk-Dockerfile

And the gatk-Dockerfile is the Dockerfile you posted below. When I run it, I still get the same error. Any ideas on what may be going on? How did you manage to make it work? Would you mind sharing the command and code?

Thank you very much again!

Regards,

Daiana

(comment by dganiewich, 6 months ago)

I just realized that you use the GitHub file and not the Docker image; maybe that is what I am doing wrong. I'll check and get back to you!

(comment by dganiewich, 6 months ago)

You have to remove the dockerPull: broadinstitute/gatk:4.1.2.0 line from your code. The Broad Institute's Docker container is part of the issue, so you don't want to use it anymore.

I used the Dockerfile I posted below. I usually put my containers on Docker Hub, but the easiest solution would be to put the Dockerfile directly into the .cwl file, like so:

cwlVersion: v1.0
class: CommandLineTool

baseCommand: 
  - "gatk"
  - "MarkDuplicatesSpark"

requirements:
  InlineJavascriptRequirement: {}

hints:
  DockerRequirement:
    dockerFile: |
      FROM ibmjava
      ARG GATK_VERSION=4.1.2.0
      RUN apt-get update && apt-get install -y \
          wget \
          unzip \
          python
      WORKDIR /software
      RUN wget https://github.com/broadinstitute/gatk/releases/download/${GATK_VERSION}/gatk-${GATK_VERSION}.zip
      RUN unzip gatk-${GATK_VERSION}.zip
      ENV PATH="/software/gatk-${GATK_VERSION}:${PATH}"
    dockerImageId: my_gatk_container

inputs:
  inputFileName_markDups
[...]
(comment by Tom, 6 months ago; modified 5 months ago)
Answer, written 6 months ago by Tom (Bielefeld University, CeBiTec, Germany):

I am pretty sure this is an issue with Spark and Docker (see Stack Overflow) and not related to CWL. I see the same error when I use the broadinstitute/gatk container to run your command line tool.

The thread on Stack Overflow provides several solutions. I made a Docker container for GATK using the IBM JDK (as suggested in the thread), and it seems to solve the problem.

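One way to see what a container makes of the host uid is to run whoami inside it with the same --user flag. This is a sketch, not a confirmed diagnosis: the helper name whoami_in_image and the DOCKER override variable are hypothetical, and the idea that the error stems from the uid having no entry in the container's own /etc/passwd is an assumption, not something established in this thread.

```shell
# Sketch: ask a container who it thinks the host uid is.
# DOCKER is a hypothetical override so the command can be dry-run with DOCKER=echo.
DOCKER="${DOCKER:-docker}"

# whoami_in_image IMAGE
# Runs whoami in IMAGE as the current host uid:gid.
whoami_in_image() {
  "$DOCKER" run --rm --user="$(id -u):$(id -g)" "$1" whoami
}

# Example (not run here, needs a Docker daemon):
#   whoami_in_image broadinstitute/gatk
```

If whoami cannot resolve a name for the uid, the container has no passwd entry for that user, which would explain why checking /etc/passwd on the host does not help.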

The Dockerfile:

FROM ibmjava

ARG GATK_VERSION=4.1.2.0

RUN apt-get update -y && apt-get install -y \
    unzip \
    wget \
    python

WORKDIR /software

RUN wget https://github.com/broadinstitute/gatk/releases/download/${GATK_VERSION}/gatk-${GATK_VERSION}.zip
RUN unzip gatk-${GATK_VERSION}.zip

ENV PATH="/software/gatk-${GATK_VERSION}:${PATH}"
(comment by Tom, 6 months ago)
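To check the Dockerfile above independently of CWL, one option is to build it by hand under the image name used for dockerImageId earlier in the thread. A minimal sketch, assuming the Dockerfile is saved as gatk-Dockerfile in the current directory; the function name build_gatk_image and the DOCKER override variable are made up for illustration:

```shell
# Sketch: build the Dockerfile by hand so the image exists locally.
# DOCKER is a hypothetical override so the command can be dry-run with DOCKER=echo.
DOCKER="${DOCKER:-docker}"

# build_gatk_image IMAGE_NAME DOCKERFILE
# Builds DOCKERFILE from the current directory and tags it IMAGE_NAME.
build_gatk_image() {
  "$DOCKER" build -t "$1" -f "$2" .
}

# Example (not run here, needs a Docker daemon):
#   build_gatk_image my_gatk_container gatk-Dockerfile
```

If the build fails here, the problem is in the Dockerfile or the network rather than in the CWL file. If it succeeds, a hint with dockerImageId: my_gatk_container should, as far as I understand, be able to use the local image.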