Question: Can't install DESeq2 via Docker container because of dependency issues
7 months ago, cdastmalchi wrote:

I have been trying to install DESeq2 in a Docker container by running the container interactively and then starting an R session where I install the package manually.

Here is my Dockerfile, which pulls down R version 3.4.1 and installs additional packages on top of it.

FROM r-base:3.4.1

WORKDIR /home

RUN apt-get update && \
    apt-get install -y \
        build-essential \
        gdb \
        git \
        jags \
        libcurl4-openssl-dev \
        libopenblas-base \
        libopenblas-dev \
        libssl-dev \
        libssh2-1-dev \
        libxml2 \
        libxml2-dev \
        python-dev \
        python-pip \
        wget \
        sudo

RUN pip install awscli boto3

ENV PATH=$PATH:~/.local/bin/
ADD . /home/
ENV R_THREADS=30

When I run this interactively and start an R session, I first check the R version and notice that it is 3.6.1, even though it was supposed to be 3.4.1. I then try installing DESeq2 via:

   if (!requireNamespace("BiocManager", quietly = TRUE))
       install.packages("BiocManager")
   BiocManager::install("DESeq2")

When I try this, I get the following response:

ERROR: dependencies ‘cli’, ‘pillar’ are not available for package ‘tibble’
* removing ‘/usr/local/lib/R/site-library/tibble’
* installing *source* package ‘GenomicRanges’ ...
** using staged installation
** libs
ERROR: dependencies ‘reshape2’, ‘tibble’ are not available for package ‘ggplot2’
* removing ‘/usr/local/lib/R/site-library/ggplot2’
ERROR: dependency ‘ggplot2’ is not available for package ‘viridis’
* removing ‘/usr/local/lib/R/site-library/viridis’
/usr/bin/ld: cannot find -lgfortran
collect2: error: ld returned 1 exit status
make: *** [/usr/share/R/share/make/shlib.mk:6: genefilter.so] Error 1
ERROR: compilation failed for package ‘genefilter’
* removing ‘/usr/local/lib/R/site-library/genefilter’
ERROR: dependencies ‘ggplot2’, ‘acepack’, ‘htmlTable’, ‘viridis’ are not available for package ‘Hmisc’
* removing ‘/usr/local/lib/R/site-library/Hmisc’
ERROR: dependencies ‘genefilter’, ‘ggplot2’, ‘Hmisc’, ‘RcppArmadillo’ are not available for package ‘DESeq2’
* removing ‘/usr/local/lib/R/site-library/DESeq2’

I then tried install.packages("devtools") to address the first ERROR, but ran into a further chain of dependency issues.

I would also try installing DESeq2 the following way, but it is not compatible with R versions >= 3.5, and my container ends up on a later version anyway:

   source("https://bioconductor.org/biocLite.R")
   biocLite("DESeq2")

Does anyone know of a proper way to install DESeq2 (dependencies and all) for either earlier or later versions of R? I need to be able to do so in a Docker container, as I'm trying to automate this program and deploy it on the cloud. Thanks!

UPDATE

I also tried this multi-image Dockerfile, which pulls an existing Bioconductor DESeq2 image (bioconductor-deseq2) as well as the Ubuntu image. The image builds successfully, but when the container invokes the entrypoint script run_deseq2.py, it fails with /bin/sh: 1: Rscript: not found. That Python script has a step that invokes an R script via a subprocess command, so it appears the final image does not retain the path to R from the first image pulled.

FROM quay.io/biocontainers/bioconductor-deseq2:1.26.0--r36he1b5a44_0

ADD src/setup.R /
RUN Rscript /setup.R 
RUN echo "Done setup."

FROM ubuntu:19.04 

ENV DEBIAN_FRONTEND=noninteractive  

WORKDIR / 

RUN apt-get update && \
    apt-get install -y \
        python-dev \
        python-pip \
        wget

RUN pip install awscli boto3

COPY src/run_deseq2.py /
COPY src/s3_utils.py /
COPY src/job_utils.py /
COPY src/deseq2.R /
COPY src/ModelLoxTag.R /

ENV PATH=$PATH:~/.local/bin/ 
ENV R_THREADS=30 

# Run docker, starting with run script
ENTRYPOINT ["python", "/run_deseq2.py"]
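
A note on the Rscript: not found message: in a Dockerfile with several FROM lines, each FROM starts a new build stage and the final image contains only the last stage's filesystem, so R from the first stage is absent unless it is copied across with COPY --from or installed in the final stage. Below is a minimal sketch, assuming run_deseq2.py calls R through subprocess, of how the script could fail early with a clearer message. The function names build_rscript_command and run_r_script are hypothetical and not taken from the original script.

```python
import shutil
import subprocess


def build_rscript_command(script_path, args=()):
    """Return the argv list for running an R script, or raise if Rscript is missing."""
    # shutil.which searches PATH the same way the shell would.
    rscript = shutil.which("Rscript")
    if rscript is None:
        raise RuntimeError(
            "Rscript not found on PATH; R does not appear to be installed in this image"
        )
    return [rscript, script_path, *args]


def run_r_script(script_path, args=()):
    """Run the R script and raise CalledProcessError on a non-zero exit code."""
    command = build_rscript_command(script_path, args)
    return subprocess.run(command, check=True)
```

Calling run_r_script("/deseq2.R") inside the container would then either run the script or stop immediately with an explicit error, which makes it easier to tell a missing R installation apart from a failure inside the R script itself.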
Tags: rna-seq, docker, deseq2, R, gene
Last modified 7 months ago by sviatoslav.kendall.

Here's an existing Dockerfile for DESeq2 that may help you restructure your Dockerfile: https://hub.docker.com/r/genomicpariscentre/deseq2/dockerfile

I have no experience with R programs and docker, so I'm sorry I can't help out with that part.

Reply written 7 months ago by kapsakcj
7 months ago, sviatoslav.kendall (United States) wrote:

Try using one of Bioconductor's own Docker containers as a starting point and then modify the install.R script to include DESeq2 and any required dependencies. Here's an example of a modified Dockerfile and install.R script I made recently that seems to work OK.

Dockerfile

FROM bioconductor/release_base2

# Helps clean up Docker images
RUN rm -rf /var/lib/apt/lists/*

ADD install.R /tmp/

# invalidates cache every 24 hours
ADD http://master.bioconductor.org/todays-date /tmp/

RUN R -f /tmp/install.R

install.R

pkgs <- c(
#    "OrganismDbi",
#    "ExperimentHub",
#    "Biobase",
#    "BiocParallel",
#    "biomaRt",
    "Biostrings",
#    "BSgenome",
#    "ShortRead",
    "IRanges",
    "GenomicRanges",
#    "GenomicAlignments",
#    "GenomicFeatures",
#    "SummarizedExperiment",
#    "VariantAnnotation",
#    "DelayedArray",
#    "GSEABase",
#    "Gviz",
#    "graph",
#    "RBGL",
#    "Rgraphviz",
    "Rsamtools"
#    "rmarkdown",
#    "httr",
#    "knitr",
#    "BiocStyle"
    )

ap.db <- available.packages(contrib.url(BiocManager::repositories()))
ap <- rownames(ap.db)
fnd <- pkgs %in% ap
pkgs_to_install <- pkgs[fnd]

ok <- BiocManager::install(pkgs_to_install, update=FALSE, ask=FALSE) %in% rownames(installed.packages())

if (!all(fnd))
    message("Packages not found in a valid repository (skipped):\n  ",
            paste(pkgs[!fnd], collapse="  \n  "))
if (!all(ok))
    stop("Failed to install:\n  ",
         paste(pkgs_to_install[!ok], collapse="  \n  "))

suppressWarnings(BiocManager::install(update=TRUE, ask=FALSE))

Thanks! Is it feasible to combine the bioconductor/release_base2 image with something like an Ubuntu image so that I can install Python packages as well (I'll be running this in a Python-based bioinformatics pipeline)? When I tried a multi-image Dockerfile, it couldn't find R (see the update).

Reply written 7 months ago by cdastmalchi

Also, when I try your container above (with DESeq2 and its dependencies added) and run it interactively, I get:

ERROR: You must set a unique PASSWORD (not 'rstudio') first! e.g. run with:
docker run -e PASSWORD=<YOUR_PASS> -p 8787:8787 rocker/rstudio
Reply written 7 months ago by cdastmalchi

Sorry, I haven't got much experience calling R scripts from inside Python and I don't know what to make of that error message either.

Reply written 7 months ago by sviatoslav.kendall

I just got the error message you posted when trying to run a similarly built Docker image with a command like:

docker run <image>

I typically run these sorts of Docker images interactively instead, with a command like:

docker run -it <image> bash

...and then run an Rscript inside the container. I think the underlying problem might be that the Dockerfile does not end with an appropriate CMD instruction, so you might want to add a line to the bottom of your Dockerfile like:

CMD ["Rscript", "Script.R"]

Hope this helps!

Reply written 6 months ago by sviatoslav.kendall