Docker image to use Rscript
6 months ago
hamarillo ▴ 70

Hi! I am having some issues combining docker with snakemake workflows.

I have some rules that run an R script on the input(s) and produce some output(s). Until now, I've been able to handle the environments for these R scripts with conda. However, I'm now at a point where conda envs are not cutting it (i.e. the libraries are not available in conda repositories, or the dependencies between them are impossible to solve unless I resolve them by hand). So, I turned to docker.

Suppose I have an R script that uses the libraries plotly and dplyr, so I create a docker image that uses r-base as the base image and install the libraries in it.

# Use an official R runtime as a parent image
FROM r-base:latest

# Install necessary R packages or any other dependencies if needed
RUN R -e "install.packages('plotly')"
RUN R -e "install.packages('dplyr')"

Then I build the docker image: docker build -t my-r-image .

Tag it: docker tag my-r-image hamarillo/my-r-image:dev

Push it: docker push hamarillo/my-r-image:dev

And finally use it in my rule:

#!/usr/bin/env python
# -*- coding: utf-8 -*-

rule generate_plotly_visualizations:
    '''
    Rule for sophisticated R-based visualization using plotly
    '''
    input:
        processed_data="results/analysis/processed_data.csv"
    output:
        plotly_visualization="results/visualizations/plotly_visualization.html"
    container:
        "docker://hamarillo/my-r-image:dev"
    script:
        "../scripts/generate_plotly_visualization.R"

Up to here, things work out fine: snakemake downloads and builds the image without problems. However, my script doesn't run. The first error is that the libraries are not available, specifically at the first library call, library(plotly), so it fails right from the beginning.

Now, I think the problem is that snakemake activates the container, and the first thing it does when you use the script: directive with an R script is run Rscript --vanilla the_script.R. BUT, my container is already inside R because I used an r-base image to create it, so it makes no sense to run Rscript --vanilla the_script.R there.

RuleException:
CalledProcessError in file containers_test/workflow/rules/plotly_visualizations.smk, line 16:
Command ' singularity  exec --home 'containers_test'  containers_test/.snakemake/singularity/411cdb23c5e82208fe7d71e579e251cb.simg bash -c 'set -euo pipefail;  Rscript --vanilla containers_test/.snakemake/scripts/tmpitwfgta6.generate_plotly_visualization.R'' returned non-zero exit status 1.

Please help. Has anyone used docker containers with snakemake before for one specific rule (not one container for the whole workflow) to run R scripts?

I think I could try to build my image from something other than r-base (e.g. ubuntu or debian), then install R and my libraries in it. I'm actually trying that out next, but I was wondering if it can be avoided (I'd like to keep the image as small as possible).

Thanks for reading! Any help is appreciated.

snakemake docker Rscript • 825 views

BUT, my container is already inside R because I used an r-base image to create it.

Maybe try setting a different entry point for your image? That way, your image will expose a shell where you can call Rscript.


Please show the exact Snakefile, command to run snakemake and the precise error message.

6 months ago
hamarillo ▴ 70

OK, the issue was that one of the R libraries was NOT properly installed by the Dockerfile. R error messages are hidden while the docker image is being built, so I did not notice. Sorry, this is my first time using Docker!
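For anyone hitting the same thing, here is a minimal sketch of a Dockerfile that stops the build when a package did not install. install.packages() only raises a warning when a package fails to build, so R -e still exits with status 0 and the image gets built anyway. Loading each package with library() right afterwards turns a missing package into an error, which makes the RUN step fail. The apt packages listed are an assumption (the thread does not say which library failed or why), and so is the explicit repos URL; replace them with whatever your own build log asks for.

```dockerfile
# Use an official R runtime as a parent image
FROM r-base:latest

# ASSUMPTION: system libraries that R packages with curl/openssl dependencies
# commonly need when compiled from source on Debian. Not confirmed by the thread.
RUN apt-get update \
    && apt-get install -y --no-install-recommends libcurl4-openssl-dev libssl-dev \
    && rm -rf /var/lib/apt/lists/*

# Install the packages, then load them in a second R call.
# A failed install is only a warning, but library() on a missing package is an
# error, so R exits non-zero and docker build stops at this step.
RUN R -e "install.packages(c('plotly', 'dplyr'), repos = 'https://cloud.r-project.org')" \
    && R -e "library(plotly); library(dplyr)"
```

If the build output is collapsed, running docker build --progress=plain -t my-r-image . prints the full R output of each step, which makes the original installation error visible.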

I did learn something useful tangentially: when you run a snakemake rule with container: and something it uses (e.g. a path in params:) is a symlink, it won't work, because symlinks are not followed inside the container.

Best,

hamarillo
