Question

Are package downgrades a necessary evil in Conda?

1

17 months ago

4galaxy77 2.8k

I am just trying to get my head around using conda environments.

I created a conda environment for a project containing plink2, plink, R and bcftools.

When I installed plink, using mamba install -n autozygosity -c conda-forge plink, I got the output:

  Package         Version  Build               Channel                    Size
────────────────────────────────────────────────────────────────────────────────
  Install:
────────────────────────────────────────────────────────────────────────────────

  + plink       1.90b6.21  hec16e2b_2          bioconda/linux-64           7MB

  Change:
────────────────────────────────────────────────────────────────────────────────

  - curl           7.86.0  h7bff187_1          conda-forge
  + curl           7.86.0  h2283fc2_1          conda-forge/linux-64     Cached
  - krb5           1.19.3  h3790be6_0          conda-forge
  + krb5           1.19.3  h08a2579_0          conda-forge/linux-64     Cached
  - libcurl        7.86.0  h7bff187_1          conda-forge
  + libcurl        7.86.0  h2283fc2_1          conda-forge/linux-64     Cached
  - libnghttp2     1.47.0  hdcd2b5c_1          conda-forge
  + libnghttp2     1.47.0  hff17c54_1          conda-forge/linux-64     Cached
  - libssh2        1.10.0  haa6b8db_3          conda-forge
  + libssh2        1.10.0  hf14f497_3          conda-forge/linux-64     Cached
  - python         3.11.0  h582c2e5_0_cpython  conda-forge
  + python         3.11.0  ha86cf86_0_cpython  conda-forge/linux-64     Cached
  - r-openssl       2.0.4  r42hfaab4ff_0       conda-forge
  + r-openssl       2.0.4  r42h1f3e0c5_0       conda-forge/linux-64     Cached

  Upgrade:
────────────────────────────────────────────────────────────────────────────────

  - openssl        1.1.1s  h166bdaf_0          conda-forge
  + openssl         3.0.7  h166bdaf_0          conda-forge/linux-64     Cached

  Downgrade:
────────────────────────────────────────────────────────────────────────────────

  - bcftools         1.16  hfe4b78e_1          bioconda
  + bcftools          1.8  h4da6232_3          bioconda/linux-64         794kB
  - htslib           1.16  h6bc39ce_0          bioconda
  + htslib            1.9  h4da6232_3          bioconda/linux-64           1MB

  Summary:

  Install: 1 packages
  Change: 7 packages
  Upgrade: 1 packages
  Downgrade: 2 packages

  Total download: 9MB

This is kind of annoying since I would like to use some of the more recent features in htslib/bcftools and 1.8 is a pretty old version.

Are these kinds of downgrades just a necessary part of using conda, or is there something that can be done? I guess one option is to create a new environment just for plink, but this seems like it could get messy quite quickly!

environments conda • 1.3k views

ADD COMMENT • link updated 17 months ago by GenoMax 141k • written 17 months ago by 4galaxy77 2.8k

score 4 · Answer 1 · 2022-11-16

4

17 months ago

i.sudbery 19k

Sometimes downgrades are necessary. However, it's perfectly possible to have both plink 1.90b2 and bcftools 1.16 in the same environment, so there must be something else that is preventing this.

The only dependency of plink is libgcc-ng>=10.3.0, which is not even that recent.

However, in general it is usually easiest to have as few tools as possible in each environment you use, to reduce the probability of things like this. Personally, I'm terrible at this, and like my whole pipeline for a given project to run in a single environment, but many people have a separate env for each tool.
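One way to look for the culprit, as a rough sketch: ask the solver for plink while pinning bcftools explicitly, as a dry run. Either it should find a solution that keeps bcftools at 1.16 or newer, or the conflict report should name the package that is holding it back. The helper name `try_install` is made up for illustration, and this assumes both the conda-forge and bioconda channels are reachable.

```shell
# Hypothetical helper: dry-run an install with explicit version pins.
# Usage: try_install ENV_NAME SPEC [SPEC ...]
try_install() {
    env_name=$1
    shift
    # --dry-run prints the solver's plan without modifying the environment
    mamba install -n "$env_name" -c conda-forge -c bioconda --dry-run "$@"
}

# Example (quote the spec so the shell does not treat ">" as a redirect):
#   try_install autozygosity plink "bcftools>=1.16"
```

If the dry run looks right, the same command without `--dry-run` applies it.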

ADD COMMENT • link 17 months ago by i.sudbery 19k

0

Thanks for the answer. "but many people have a separate env for each tool": how would this work? Can you load multiple environments at the same time?

ADD REPLY • link 17 months ago by 4galaxy77 2.8k

1

No. You load one env, run the tool, then load the other env, run that tool, etc.

I believe there is also now "conda run", which will run a command in an env.

I think the most common thing is actually to have an env per command-line statement, rather than per tool.

Things like Snakemake and Nextflow will even automate building an env for a particular step, and cache it in case it's needed again.
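As a minimal sketch of the "one env per step" pattern with `conda run`: the wrapper below runs a single command inside a named environment without activating it in the current shell. The function name `in_env` and the environment names in the example are hypothetical.

```shell
# Hypothetical helper: run one command inside a named environment.
# Usage: in_env ENV_NAME COMMAND [ARGS ...]
in_env() {
    env_name=$1
    shift
    conda run -n "$env_name" "$@"
}

# Example, assuming envs named plink_env and bcftools_env already exist:
#   in_env plink_env plink --version
#   in_env bcftools_env bcftools --version
```

Each tool keeps its own dependency set this way, so installing one cannot downgrade the other.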

ADD REPLY • link 17 months ago by i.sudbery 19k

0

"You load one env, run the tool. Load the other env, run that tool etc."

I think of this as a "poor man's" modules system. Something you can control without admin privileges.

ADD REPLY • link 17 months ago by GenoMax 141k

0

"I like my whole pipeline for a given project to run in a single environment"

I agree with this. I create an environment for each project and create additional environments if incompatibilities arise. Specifically, for each project I have a requirements.txt where I list the packages I need and their versions. As the project develops, I add or remove packages in requirements.txt and install them with mamba install --file requirements.txt -n my-project-env. One env per tool seems unmanageable to me, but I'd like to hear other opinions...

"can you load multiple environments at the same time"

conda activate has a --stack option that could make that work but again, it seems messy to me.
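For what it's worth, the requirements.txt workflow above could look roughly like this. The package list is only an illustration (I am assuming r-base for R and a bcftools pin of 1.16 from the original question), and both function names are made up.

```shell
# Hypothetical package list; pin whatever your project actually needs.
# One conda spec per line; "bcftools=1.16" keeps bcftools on the 1.16 series.
write_requirements() {
    cat > "$1" <<'EOF'
plink
plink2
bcftools=1.16
r-base
EOF
}

# Install everything listed in the file into the named environment.
# Usage: install_requirements ENV_NAME FILE
install_requirements() {
    mamba install -n "$1" -c conda-forge -c bioconda --file "$2"
}

# Example:
#   write_requirements requirements.txt
#   install_requirements my-project-env requirements.txt
```

Pinning the packages you care about in the file means the solver has to report a conflict rather than quietly downgrade them.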

ADD REPLY • link 17 months ago by dariober 14k

0

This is how I do things. But I've now got to the point where the envs for some of my projects take more than 90 minutes to build even with mamba, and often fail, so I might have to have a rethink.

ADD REPLY • link 17 months ago by i.sudbery 19k

0

It may be interesting to post one such case and see what people think... My projects usually have something like 30 dependencies listed in requirements.txt. Sometimes envs break and I need to rebuild them, but it's usually a matter of minutes. Besides, I'm still not entirely settled on what should go in requirements.txt. For example, if you use ggplot2, do you also list R? (I don't)

ADD REPLY • link 17 months ago by dariober 14k