Question: CWL docker command user
dganiewich (Buenos Aires, Argentina) wrote:

Hi! I am using CWL to run a program. When I run cwl-runner tool.cwl tool_params.yml, the generated docker command includes

--user=500:500

When I try running my tool, there is an id error that doesn't allow me to run it correctly (see https://gatkforums.broadinstitute.org/gatk/discussion/comment/59004#Comment_59004).

How can I make it use the correct uid? I checked my /etc/passwd file and the user is indeed 500:500. Any ideas what may be going on?

Thank you,

Best,

Daiana

docker cwl cwl-runner • 105 views
modified 5 days ago by Tom • written 12 days ago by dganiewich

I'm not sure this problem is caused by missing permissions, but then again I have no experience using GATK. Would you mind sharing the Docker container and the command line tool?

written 10 days ago by Tom

Hi Tom! Sorry for the belated response. Here is the command line tool where you can see the docker container:

#!/usr/bin/env cwl-runner

cwlVersion: v1.0
class: CommandLineTool

baseCommand: 
  - "gatk"
  - "MarkDuplicatesSpark" 

hints:
  DockerRequirement:
    dockerPull: broadinstitute/gatk

requirements:
- class: InlineJavascriptRequirement

inputs:
  inputFileName_markDups:
    type: File
    inputBinding:
      position: 4
      prefix: -I
    doc: One or more input SAM or BAM files to analyze. Must be coordinate sorted.
      Default value null. This option may be specified 0 or more times

  validationStringency:
    type: string
    default: LENIENT
    inputBinding:
      position: 23
      prefix: -VS
    doc: Validation stringency for all SAM/BAM/CRAM/SRA files read by this program. The default stringency value SILENT can improve performance when processing a BAM file in which variable-length data (read, qualities, tags) do not otherwise need to be decoded.

  metricsFile:
    type: string
    default: "metrics.txt"
    inputBinding:
      position: 6
      prefix: -M
    doc: File to write duplication metrics to. Required.

  createIndex:
    type: string?
    default: 'true'
    inputBinding:
      position: 20
      prefix: -OBI
    doc: Whether to create a BAM index when writing a coordinate-sorted BAM file.
      Default value false. This option can be set to 'null' to clear the default value.
      Possible values {true, false}
outputs:
  markDups_output:
    type: File
    outputBinding:
      glob: output.dedup.bam
    secondaryFiles:
      - .bai

arguments:
- position: 10
  prefix: '-O'
  valueFrom: output.dedup.bam

Thanks, Daiana

written 8 days ago by dganiewich
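
For anyone trying to reproduce this, the job file passed as tool_params.yml is not shown in the thread. A minimal sketch could look like the following, assuming a coordinate-sorted BAM at the hypothetical path sample.sorted.bam. The remaining inputs fall back to the defaults declared in the tool above.

```yaml
# tool_params.yml (hypothetical; the original job file was not posted)
inputFileName_markDups:
  class: File
  path: sample.sorted.bam   # placeholder path to a coordinate-sorted BAM
```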

Hi Daiana! Did the answer below help you? Please feel free to ask if you have further questions.

Coming from a biology background, it takes me a lot of time (and questions in this forum) to figure some of the bioinformatics stuff out, so I'm very sympathetic to anyone needing more elaborate explanations.

Regards, Tom

modified 1 day ago • written 1 day ago by Tom

Hi Tom, thank you very much for your help! Indeed, I have struggled a lot with bioinformatics stuff! I'm sorry for my belated response, but I have not yet been able to make it work. Maybe there is something wrong with how I import the Dockerfile? Just to be sure, my new code has:

#!/usr/bin/env cwl-runner

cwlVersion: v1.0
class: CommandLineTool

baseCommand: 
  - "gatk"
  - "MarkDuplicatesSpark" 

requirements:
  - class: InlineJavascriptRequirement

hints:  
  - $import: gatk-docker.yml

etc...

And then my gatk-docker.yml is:

class: DockerRequirement
dockerPull: broadinstitute/gatk:4.1.2.0
dockerFile: >
  $import: gatk-Dockerfile

And gatk-Dockerfile is the Dockerfile you posted below. Still, when I run it, I get the same error. Any ideas on what may be going on? How did you manage to make it work? Would you mind sharing the command and code?

Thank you very much again!

Regards,

Daiana

modified 1 day ago • written 1 day ago by dganiewich

I just realized that you use the GitHub file and not the Docker image; maybe that is what I am doing wrong. I'll check and get back to you!

written 1 day ago by dganiewich

You have to remove the dockerPull: broadinstitute/gatk:4.1.2.0 line from your code. The Broad Institute's Docker container is what causes the issue, so you don't want to use that one anymore.

I used the Dockerfile I posted below. I usually put my containers on Docker Hub, but the easiest solution would be to put the Dockerfile right into the .cwl file, like so:

cwlVersion: v1.0
class: CommandLineTool

baseCommand: 
  - "gatk"
  - "MarkDuplicatesSpark"

requirements:
  InlineJavascriptRequirement: {}

hints:
  DockerRequirement:
    dockerFile: |
      FROM ibmjava
      ARG GATK_VERSION=4.1.2.0
      RUN apt-get update && apt-get install -y \
          wget \
          unzip \
          python
      WORKDIR /software
      RUN wget https://github.com/broadinstitute/gatk/releases/download/${GATK_VERSION}/gatk-${GATK_VERSION}.zip
      RUN unzip gatk-${GATK_VERSION}.zip
      ENV PATH="/software/gatk-${GATK_VERSION}:${PATH}"
    dockerImageId: my_gatk_container

inputs:
  inputFileName_markDups
[...]
written 18 hours ago by Tom
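
For the separate-file layout attempted earlier in this thread, a minimal sketch of gatk-docker.yml might look like the following. It assumes the runner resolves the schema-salad $include directive (which inserts a file's contents as a string) in this position, and that gatk-Dockerfile sits next to the YAML file. Inside a `>` block scalar, the text `$import: gatk-Dockerfile` is only a literal string, so it would be handed to Docker as the Dockerfile body rather than being resolved. The dockerPull line is dropped, as advised above, and dockerImageId reuses the my_gatk_container name from the inline example.

```yaml
# gatk-docker.yml (sketch; assumes gatk-Dockerfile is in the same directory)
class: DockerRequirement
dockerFile:
  $include: gatk-Dockerfile   # file contents are inserted as a plain string
dockerImageId: my_gatk_container
```

The tool itself can keep pulling this in through `hints: - $import: gatk-docker.yml`, since $import there loads a whole YAML object rather than a string.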
Tom (Bielefeld University, CeBiTec, Germany) wrote:

I am pretty sure this is an issue between Spark and Docker (see Stack Overflow) and not related to CWL. I get the same error when I use the broadinstitute/gatk container to run your command line tool.

The thread on Stack Overflow provides several solutions. I built a Docker container for GATK using the IBM JDK (as suggested in the thread), and it seems to solve the problem.

modified 4 days ago • written 5 days ago by Tom

The dockerfile:

FROM ibmjava

ARG GATK_VERSION=4.1.2.0

RUN apt-get update -y && apt-get install -y \
    unzip \
    wget \
    python

WORKDIR /software

RUN wget https://github.com/broadinstitute/gatk/releases/download/${GATK_VERSION}/gatk-${GATK_VERSION}.zip
RUN unzip gatk-${GATK_VERSION}.zip

ENV PATH="/software/gatk-${GATK_VERSION}:${PATH}"
written 5 days ago by Tom