CWL docker command user
3.5 years ago
dganiewich ▴ 130

Hi! I am using CWL to run a program. When I run cwl-runner tool.cwl tool_params.yml, the generated docker command includes:

--user=500:500

When I try running my tool, there is an id error that doesn't allow me to run it correctly (see https://gatkforums.broadinstitute.org/gatk/discussion/comment/59004#Comment_59004).

How can I make it use the correct uid? I checked my /etc/passwd file and the user is indeed 500:500. Any ideas what may be going on?

Thank you,

Best,

Daiana

cwl docker cwl-runner • 1.7k views

I'm not sure this problem is caused by missing permissions. Then again, I have no experience using GATK. Would you mind sharing the Docker container and the command line tool?


Hi Tom! Sorry for the belated response. Here is the command line tool where you can see the docker container:

#!/usr/bin/env cwl-runner

cwlVersion: v1.0
class: CommandLineTool

baseCommand:
  - "gatk"
  - "MarkDuplicatesSpark"

hints:
  DockerRequirement:

requirements:
  - class: InlineJavascriptRequirement

inputs:
  inputFileName_markDups:
    type: File
    inputBinding:
      position: 4
      prefix: -I
    doc: One or more input SAM or BAM files to analyze. Must be coordinate sorted.
      Default value null. This option may be specified 0 or more times

  validationStringency:
    type: string
    default: LENIENT
    inputBinding:
      position: 23
      prefix: -VS
    doc: Validation stringency for all SAM/BAM/CRAM/SRA files read by this program. The default stringency value SILENT can improve performance when processing a BAM file in which variable-length data (read, qualities, tags) do not otherwise need to be decoded.

  metricsFile:
    type: string
    default: "metrics.txt"
    inputBinding:
      position: 6
      prefix: -M
    doc: File to write duplication metrics to Required

  createIndex:
    type: string?
    default: 'true'
    inputBinding:
      position: 20
      prefix: -OBI
    doc: Whether to create a BAM index when writing a coordinate-sorted BAM file.
      Default value false. This option can be set to 'null' to clear the default value.
      Possible values {true, false}

outputs:
  markDups_output:
    type: File
    outputBinding:
      glob: output.dedup.bam
    secondaryFiles:
      - .bai

arguments:
  - position: 10
    prefix: '-O'
    valueFrom: output.dedup.bam

Thanks, Daiana
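
For reference, a job file matching the inputs of the tool above might look like the sketch below. The BAM path is a hypothetical placeholder, and the three string inputs only restate the defaults from the tool definition, so they could be omitted.

```yaml
# tool_params.yml (sketch; sample.sorted.bam is a hypothetical path)
inputFileName_markDups:
  class: File
  path: sample.sorted.bam
# The values below repeat the defaults declared in the tool definition.
validationStringency: LENIENT
metricsFile: metrics.txt
createIndex: 'true'
```

A file like this would be passed as the second argument, as in cwl-runner tool.cwl tool_params.yml.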


Coming from a biology background, it takes me a lot of time (and many questions in this forum) to figure out some of the bioinformatics stuff, so I'm very sympathetic to anyone needing more elaborate explanations.

Regards, Tom


Hi Tom, thank you very much for your help! Indeed, I have struggled a lot with bioinformatics stuff! I'm sorry for my belated response, but I have not yet been able to make it work... Maybe there is something wrong with how I import the Dockerfile? Just to be sure, my new code has:

#!/usr/bin/env cwl-runner

cwlVersion: v1.0
class: CommandLineTool

baseCommand:
  - "gatk"
  - "MarkDuplicatesSpark"

requirements:
  - class: InlineJavascriptRequirement

hints:
  - $import: gatk-docker.yml

etc...

And then my gatk-docker.yml is:

class: DockerRequirement
dockerPull: broadinstitute/gatk:4.1.2.0
dockerFile: >$import: gatk-Dockerfile


And the gatk-Dockerfile is your dockerfile posted below. Still when I run it, I get the same error... Any ideas on what may be going on? How did you manage to make it work? Would you mind sharing the command and code?

Thank you very much again!

Regards,

Daiana


I just realized that you use the GitHub file and not the Docker image, maybe that is what I am doing wrong... I'll check and get back to you!


You have to remove the dockerPull: broadinstitute/gatk:4.1.2.0 line from your code. The Broad Institute's Docker container is part of the issue, so you don't want to use that one anymore.
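
For the split-file layout from the earlier comment, a minimal sketch of gatk-docker.yml without dockerPull could look like the following. It assumes the Dockerfile is saved in the same directory as gatk-Dockerfile and pulls it in with $include, which embeds the file as plain text (unlike $import, which parses its target as a structured document). The image name is only a label for the locally built image.

```yaml
# gatk-docker.yml (sketch; assumes gatk-Dockerfile sits in the same directory)
class: DockerRequirement
# No dockerPull entry, so the runner is expected to build the image from
# the Dockerfile rather than pull broadinstitute/gatk.
dockerFile:
  $include: gatk-Dockerfile
# Name for the locally built image; assumed to be otherwise unused.
dockerImageId: my_gatk_container
```

The tool would keep referencing this file from its hints section via $import: gatk-docker.yml.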

I used the Dockerfile I posted below. I usually put my containers on Docker Hub, but the easiest solution would be to put the Dockerfile right into the .cwl file, like so:

cwlVersion: v1.0
class: CommandLineTool

baseCommand:
  - "gatk"
  - "MarkDuplicatesSpark"

requirements:
  InlineJavascriptRequirement: {}

hints:
  DockerRequirement:
    dockerFile: |
      FROM ibmjava
      ARG GATK_VERSION=4.1.2.0
      RUN apt-get update && apt-get install -y \
        wget \
        unzip \
        python
      WORKDIR /software
      RUN wget https://github.com/broadinstitute/gatk/releases/download/${GATK_VERSION}/gatk-${GATK_VERSION}.zip
      RUN unzip gatk-${GATK_VERSION}.zip
      ENV PATH="/software/gatk-${GATK_VERSION}:${PATH}"
    dockerImageId: my_gatk_container

inputs:
  inputFileName_markDups
  [...]

3.5 years ago
Tom ▴ 530

I am pretty sure this is just an issue of spark/docker (see stackoverflow) and not related to CWL. I experience the same issue when trying to use the broadinstitute gatk container to run your command line tool.

The thread on stackoverflow provides several solutions. I made a docker container for gatk using the IBM JDK (as was suggested in the thread) and it seems to solve the problem.

The dockerfile:

FROM ibmjava
ARG GATK_VERSION=4.1.2.0
RUN apt-get update -y && apt-get install -y \
  unzip \
  wget \
  python
WORKDIR /software
RUN wget https://github.com/broadinstitute/gatk/releases/download/${GATK_VERSION}/gatk-${GATK_VERSION}.zip
RUN unzip gatk-${GATK_VERSION}.zip

ENV PATH="/software/gatk-${GATK_VERSION}:${PATH}"