Tutorial: Using R in Conda
2
15
Entering edit mode
11 months ago

Hi Everyone!!

Having used conda for almost a year now, I thought I would write up these short tips to help beginners keep their R installation in conda tidy and error-free. Here are my suggestions, based on my experience:

1. Install R in a separate environment. Other conda packages may require a downgrade or upgrade of your r-base, which can cause version problems. Use `conda create -n R` to create a new environment.
2. Always install the r-essentials package rather than the conventional r-base, to spare yourself the horrors that curl, zlib, and glib bring with them in r-base: `conda install -c r r-essentials`. Also install rstudio to enjoy a GUI experience: `conda install -c r rstudio`. You can run RStudio by simply typing `rstudio` on the command line.
3. Three errors are inevitable in R, the solutions to which can be found here and here. Generally, the Makeconf file is not empty if you do the installation in a separate environment rather than in the default base environment.
4. Do not update the default packages that come with r-essentials. You are likely to undo all your efforts by doing so, since it will update rcurl and everything else, changing their respective paths.
5. R is particular about the sequence in which you install packages. Maintain a separate file and keep the installation commands saved in it, so you can reuse them if R needs to be reinstalled. An example of the order is here.
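Suggestion 5 (together with the commands from suggestions 1 and 2) can be sketched as a small shell helper that writes each command to a file, in order, before running it. This is only a sketch: `logged` and the default file name `r_install_commands.sh` are hypothetical names, not part of conda, and arguments are recorded joined by spaces, so commands with quoted arguments would need more care.

```shell
#!/bin/bash
# Hypothetical helper: record each command in a file, in order, then run it.
# LOG_FILE and its default name are assumptions made for this sketch.
logged() {
    printf '%s\n' "$*" >> "${LOG_FILE:-r_install_commands.sh}"
    "$@"
}

# Usage (commented out because it needs a working conda installation):
# logged conda create -y -n R
# logged conda install -y -n R -c r r-essentials
```

Replaying the saved file later should repeat the installation in the same order.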

I invite others to share their experience because installation of R has been a challenge.

R Tutorial • 5.9k views
1
Entering edit mode

Hi,

If we go back to the question of why one would use R with conda: we are trying to create a separate environment for each project so that we do not mess around with R package versions.

If this is the reason, why not use renv? Take a look: https://rstudio.github.io/renv/index.html

1
Entering edit mode

That might be a good solution for R, but conda has an advantage in that it can handle a lot more than just R or Python. I frequently use conda to create an entire reproducible software stack, which can include R and Python libraries in addition to bioinformatics tools, db engines like PostgreSQL, and services such as Celery, RabbitMQ, and even nginx.
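A mixed stack of this kind is usually described in a single environment file. The sketch below writes one from the shell; the environment name `project-stack` and the package selection are illustrative assumptions, not the poster's actual setup, and a real project would list its own tools and pin the versions it has tested.

```shell
#!/bin/bash
# Write a hypothetical environment file describing a mixed software stack.
# The environment name and package selection are illustrative assumptions.
cat > environment.yml <<'EOF'
name: project-stack
channels:
  - conda-forge
  - bioconda
dependencies:
  - r-base
  - python
  - postgresql
  - nginx
  - samtools
EOF

# Recreate the stack on another machine (needs conda, so left commented out):
# conda env create -f environment.yml
```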

1
Entering edit mode
6 weeks ago

#### Update 5 May 2021

As my knowledge improved over time, I learned that installing from conda-forge is more reliable than installing from private repositories, since its packages are tested and reviewed by the conda-forge community maintainers. So here are new steps to install R.

Also, the RStudio package available on the Anaconda package site downgrades r-base from 4.0.3 to 3.6, so I no longer suggest installing RStudio that way. If you want to use RStudio, just activate your R environment in conda, and RStudio will pick up whichever r-base version is available in that environment. So the same RStudio can run on multiple R versions by switching between different R environments in conda.

Making a separate R environment is also desirable to keep it safe from unintentional upgrades or downgrades of r-base (some tools come with their own r-base version and downgrade r-base).

Finally, use mamba to install R packages quickly and to resolve dependency issues.

wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh -O miniconda.sh \
&& chmod +x miniconda.sh && bash miniconda.sh -b -p miniconda

base_dir="$PWD"

export PATH=$base_dir/miniconda/bin:$PATH
source ~/.bashrc
# source conda.sh from ~/.profile so the conda command is available in login shells
echo -e ". $base_dir/miniconda/etc/profile.d/conda.sh" >> ~/.profile
conda init bash

#### Installing R

# Install mamba for faster downloading of packages in conda
conda install mamba -n base -c conda-forge -y
conda update conda -y
conda update --all

# Create the R environment in conda
mamba create -n R -c conda-forge r-base -y

# Activate the R environment
conda activate R
mamba install -c conda-forge r-essentials

# Open R, install BiocManager, and select a mirror to install the packages from. Use the following:
if (!requireNamespace("BiocManager", quietly = TRUE))
    install.packages("BiocManager")

Note: You cannot update R packages installed through r-essentials, especially rcurl. So it is still not a very good way of installing R. Packages such as ComplexHeatmap and Cairo cannot be installed.

3
Entering edit mode

In my experience, R doesn't play well with conda, in part because it always uses R_LIBS_USER when it is set. See this post on the RStudio community forum for some tips to avoid problems.

0
Entering edit mode

You should not need to manually alter the bash PATH variable. That can lead to problems. `conda init bash` also modifies the path, so you will end up with two competing PATH settings. To modify the bash PATH, run conda like so:

miniconda/bin/conda init bash

then instruct the user to restart their terminal.

1
Entering edit mode

I personally prefer to have full control over what is in PATH and what is not via a simple command. If you run `echo 'auto_activate_base: false' >> $HOME/.condarc`, this will avoid auto-activation of the conda base environment upon startup of a new terminal. With `conda activate <envname>` you can start the base (or any) environment, and this will put the respective bin folder into PATH. That makes sure that you are always explicit about what is in PATH and what is not. You may want to, e.g., compile some software outside of conda, and that should require conda being completely turned off to avoid mixing of libs and packages. What will always be in PATH (via the installer, afaik) is condabin, which contains the conda executable, and this can then be used, as described above, to conveniently turn environments on and off.
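One caveat with appending to `.condarc` via `echo ... >>` is that running it twice leaves duplicate lines. A minimal sketch of an idempotent variant follows; `set_condarc_line` is a hypothetical helper name, not part of conda, and it only checks for an exact matching line, so it would not notice an existing `auto_activate_base: true`. The built-in `conda config --set auto_activate_base false` is the more robust route when conda is already on PATH.

```shell
#!/bin/bash
# Hypothetical helper: append a line to a condarc-style file only if that
# exact line is not already present, so repeated runs do not duplicate it.
set_condarc_line() {
    local file="$1" line="$2"
    touch "$file"
    grep -qxF -- "$line" "$file" || printf '%s\n' "$line" >> "$file"
}

# Usage (commented out so that sourcing this sketch changes nothing):
# set_condarc_line "$HOME/.condarc" 'auto_activate_base: false'
# conda activate R      # explicitly puts that environment's bin folder on PATH
# conda deactivate      # removes it again
```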

0
Entering edit mode

Yes, I agree this is how it should work, but by default nothing should be activated (not even the base environment).

A lot of problems occur because base is auto-enabled, and installing into base can cause extremely hard-to-debug problems. Without an explicitly activated environment, `conda install` should refuse to work.

Here is hoping that the system will evolve towards being more explicit rather than "convenient".

That being said, right now for a beginner, it is probably still best to follow the standard conda practices that are better documented at this time. Then new releases can provide a documented migration path.
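Until conda behaves that way itself, the stricter behaviour wished for above can be approximated with a small shell function. This is a minimal sketch, assuming `conda activate` has exported `CONDA_DEFAULT_ENV` with the active environment's name; `safe_conda_install` is a hypothetical name, not a conda command.

```shell
#!/bin/bash
# Hypothetical guard: refuse to install unless a non-base environment is active.
# Relies on CONDA_DEFAULT_ENV, which conda activate sets for the active environment.
safe_conda_install() {
    if [ -z "${CONDA_DEFAULT_ENV:-}" ] || [ "$CONDA_DEFAULT_ENV" = "base" ]; then
        echo "refusing to install: activate a non-base environment first" >&2
        return 1
    fi
    conda install "$@"
}

# Usage (needs conda, so left commented out):
# conda activate R
# safe_conda_install -c conda-forge r-essentials
```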

0
Entering edit mode

Conda introduces a dbus-daemon which conflicts with the system one in some Linux distributions. This results in problems (for example with matplotlib). In my view, this is the result of two bad practices: 1) not using a hard-coded path for system components in the OS, and 2) modifying the user's $PATH on conda's side.

0
Entering edit mode

I explicitly mentioned that for cases where `conda init bash` doesn't work. I faced such a problem on a server.

0
Entering edit mode

I do not like this because

• the user should not need to restart their terminal or do anything interactive for the scripts/programs/etc. to work besides just running the commands desired
• the user should not need to apply permanent modifications to their system environment in order for the scripts/programs/etc. to work
• the scripts/programs/etc. should be isolated from the rest of the system as much as possible and shouldn't have side effects that could potentially break other software on the system

I would much rather have conda break and then debug why it's unhappy than have conda apply some unknown, mysterious configs to my system and potentially break everything else on it. I understand that this is a fundamental difference between "the best way to use conda" and the developers' intentions for how conda should be used.

0
Entering edit mode
6 weeks ago
steve ★ 3.0k

(DISCLAIMER: some of the steps described here explicitly go against the official conda installation & usage instructions; beginners should follow the official guides instead before trying any of these steps, and fully understand what these steps are doing before trying them out)

Thanks for the detailed notes. I have never found R and R lib installation in conda to be particularly difficult as long as you follow some of these steps & precautions:

0). use a fresh new conda installation for each project; don't try to manage multiple conda envs in a single conda installation. You can download Miniconda from here: https://repo.anaconda.com/miniconda/ . Yes, this wastes some disk space, but it saves you a lot of headaches.
1). install all the R packages you need up front with a single `conda install` command without specifying library versions, let conda pick compatible libraries, and note which versions it chose. Then delete the entire conda install, start over with a fresh conda installation, and run `conda install` again while specifying the exact versions of the libraries you want, based on the list conda chose for you.
1a). be careful if conda tries to update itself and/or its included Python version, because sometimes this can cause conda to break itself. If that happens, be sure to include args with `conda install` to lock the versions of conda itself and/or Python.
1b). make sure the full path to the conda install directory is not too long, because it gets hard-coded into the shebang lines in a lot of the installed files, and shebang lines usually have a size limit of ~127 characters: https://stackoverflow.com/questions/10813538/shebang-line-limit-in-bash-and-linux-kernel
2). never ever run `conda install` again after the first time unless you absolutely have to (occasionally I've had to install a package like ncurses from conda-forge before installing anything else, but that's rare).
3). don't ever bother with `conda activate`; just update PATH yourself to prepend conda/bin, since all your libs will be installed there by default, and update any other needed env variables yourself as well.
4). keep all the commands you used for the entire process saved in a script with your project, and use a wrapper script to correctly set the environment to run your scripts and programs (you might need to unset PYTHONPATH and PYTHONHOME, and apply other env updates that `conda activate` normally handles).
Or, if you are adventurous, your wrapper script can just call `conda activate` and pray that it doesn't have side effects that break something.

I have been using conda like this for many years with success, notably sticking to conda (Miniconda) distributions for versions 4.5.4 and 4.7.12; your mileage may vary.

A lot of people seem to have negative sentiment towards conda due to its tendency to try and "take over" your system; the steps described here were developed to try and prevent that while still keeping conda installations reproducible and reliable. Concerns about disk space usage from multiple complete conda installs can be mitigated somewhat by keeping full install scripts associated with each project, for example:

#!/bin/bash
# save as install.sh
set -e
# download and install conda in the current directory
CONDASH=Miniconda3-4.5.4-Linux-x86_64.sh
wget "https://repo.anaconda.com/miniconda/${CONDASH}"
bash "${CONDASH}" -b -p conda
rm -f "${CONDASH}"
# set the environment to use the conda you installed
# re-use these configs for wrapper scripts to run your R, Python, etc., scripts
export PATH=${PWD}/conda/bin:${PATH}
unset PYTHONPATH
unset PYTHONHOME
# install the conda packages you wanted
conda install -y somechannel::somepackage==1.2.3


so you can go ahead and delete old conda install directories you are not using anymore and easily recreate them later as needed.
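The wrapper script from step 4, combined with the path-length precaution from step 1b, might look roughly like the sketch below. `run_in_conda` is a hypothetical name, the 127-character threshold is simply the figure quoted above, and `bin/python` stands in for whatever interpreter path ends up in the installed shebang lines.

```shell
#!/bin/bash
# Hypothetical wrapper: run one command with a project-local conda install
# first on PATH, without calling conda activate.
run_in_conda() {
    local prefix="$1"
    shift
    # Installed scripts get the prefix hard-coded into their shebang line,
    # so check that such a line stays within the ~127 character limit.
    local shebang="#!${prefix}/bin/python"
    if [ "${#shebang}" -gt 127 ]; then
        echo "conda prefix too long for shebang lines: ${prefix}" >&2
        return 1
    fi
    # The subshell keeps these environment changes from leaking to the caller.
    (
        export PATH="${prefix}/bin:${PATH}"
        unset PYTHONPATH PYTHONHOME
        "$@"
    )
}

# Usage (assumes ./conda was created by install.sh):
# run_in_conda "${PWD}/conda" Rscript analysis.R
```

Because the changes happen in a subshell, the calling shell's PATH is left untouched once the command finishes.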