Question: Setting up environment for RNASeq on university HPCC CondaHTTPError: HTTP 000 CONNECTION FAILED
6 weeks ago by
mahejabeen.nidhi10 wrote:
$ conda install -c bioconda fastqc
Collecting package metadata (current_repodata.json): failed

CondaHTTPError: HTTP 000 CONNECTION FAILED for url <https://conda.anaconda.org/bioconda/linux-64/current_repodata.json>
Elapsed: -

An HTTP error occurred when trying to retrieve this URL.
HTTP errors are often intermittent, and a simple retry will get you on your way.
ConnectionError(MaxRetryError("HTTPSConnectionPool(host='conda.anaconda.org', port=443): Max retries exceeded with url: /bioconda/linux-64/current_repodata.json (Caused by NewConnectionError('<urllib3.connection.VerifiedHTTPSConnection object at 0x7fe37c3ff050>: Failed to establish a new connection: [Errno 101] Network is unreachable'))"))

I am using the university's HPC system (details of it below) for RNASeq. However, as you can see above, I cannot download packages due to CondaHTTPError. How can I resolve this?

$ conda info

     active environment : None
       user config file : /home/mhnidhi2/.condarc
 populated config files : /home/mhnidhi2/.condarc
          conda version : 4.7.12
    conda-build version : 3.18.9
         python version : 3.7.4.final.0
       virtual packages : 
       base environment : /opt/ohpc/pub/anaconda3  (read only)
           channel URLs : https://conda.anaconda.org/bioconda/linux-64
                          https://conda.anaconda.org/bioconda/noarch
                          https://conda.anaconda.org/conda-forge/linux-64
                          https://conda.anaconda.org/conda-forge/noarch
                          https://repo.anaconda.com/pkgs/main/linux-64
                          https://repo.anaconda.com/pkgs/main/noarch
                          https://repo.anaconda.com/pkgs/r/linux-64
                          https://repo.anaconda.com/pkgs/r/noarch
          package cache : /opt/ohpc/pub/anaconda3/pkgs
                          /home/mhnidhi2/.conda/pkgs
       envs directories : /home/mhnidhi2/.conda/envs
                          /opt/ohpc/pub/anaconda3/envs
               platform : linux-64
             user-agent : conda/4.7.12 requests/2.22.0 CPython/3.7.4 Linux/3.10.0-957.el7.x86_64 centos/7.6.1810 glibc/2.17
                UID:GID : 1311:1001
             netrc file : None
           offline mode : False
http rna-seq conda connection hpc • 146 views
ADD COMMENTlink written 6 weeks ago by mahejabeen.nidhi10
1

You should talk to your cluster admin about the connection error.

ADD REPLYlink written 6 weeks ago by ATpoint36k
1

Another way to resolve these conda-related connection issues (if it isn't just an intermittent issue) is to build the environment in a Singularity container on your local machine (laptop/desktop), where you have control over the connection. Then move that container to the HPC.
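
A minimal sketch of that workflow, under some assumptions: the `continuumio/miniconda3` Docker image is used as the base, Singularity is installed on the local machine, and the file names `rnaseq.def` and `rnaseq.sif` as well as the host `hpc.example.edu` are placeholders chosen for illustration. The function only writes a definition file; the build, copy, and run steps are shown as comments because they need Singularity and network access.

```shell
#!/usr/bin/env bash
# Write a Singularity definition file that installs FastQC from bioconda.
# The file name is passed as the first argument (hypothetical default below).
write_rnaseq_def() {
    local def_file="$1"
    cat > "$def_file" <<'EOF'
Bootstrap: docker
From: continuumio/miniconda3

%post
    # Runs once at build time, on the machine that has internet access.
    /opt/conda/bin/conda install -y -c conda-forge -c bioconda fastqc
    /opt/conda/bin/conda clean -a -y

%environment
    export PATH=/opt/conda/bin:$PATH

%runscript
    exec "$@"
EOF
}

write_rnaseq_def "rnaseq.def"

# On the local machine (building usually needs root or --fakeroot):
#   sudo singularity build rnaseq.sif rnaseq.def
#   scp rnaseq.sif user@hpc.example.edu:~/
# On the cluster, where no internet access is needed at run time:
#   singularity exec rnaseq.sif fastqc --version
```

Further packages would be added to the `conda install` line in `%post`, followed by a rebuild and a fresh copy to the cluster.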

ADD REPLYlink written 6 weeks ago by bruce.moran830

That is an amazing idea! I think I will try to do that. I am new to working with HPCC. Do you have a git repo on building containers like that or other HPCC functions? Thank you again!

ADD REPLYlink written 6 weeks ago by mahejabeen.nidhi10
1

Do yourself a favor and solve the underlying problem with your cluster admin. If you are going back and forth between a local machine and an HPC, you have to push a new container to the HPC every time you want to install a new package, which must be laborious.

ADD REPLYlink written 6 weeks ago by ATpoint36k

Very fair point. I already emailed the admin. Hopefully he can resolve this issue. Thank you!

ADD REPLYlink written 6 weeks ago by mahejabeen.nidhi10
2

If your cluster does not have direct/external internet access then many things will not work. Perhaps you need to use a proxy. Again, this is local information that you would need to find out from your admin.
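
If the admin confirms that a proxy exists, conda can be pointed at it through the `proxy_servers` key in `~/.condarc`. The sketch below is only an illustration: the function name `add_conda_proxy` and the address `http://proxy.example.edu:3128` are hypothetical, and the real host and port have to come from the cluster documentation or admin.

```shell
#!/usr/bin/env bash
# Append a proxy_servers block to a conda config file, unless one is already there.
#   $1: path to the config file (normally "$HOME/.condarc")
#   $2: proxy URL supplied by the cluster admin
add_conda_proxy() {
    local condarc="$1"
    local proxy_url="$2"

    # Do not add a second block if the key is already configured.
    if grep -q '^proxy_servers:' "$condarc" 2>/dev/null; then
        echo "proxy_servers already set in $condarc; leaving it unchanged" >&2
        return 0
    fi

    cat >> "$condarc" <<EOF
proxy_servers:
  http: $proxy_url
  https: $proxy_url
EOF
}

# Example call (placeholder address, not a real proxy):
#   add_conda_proxy "$HOME/.condarc" "http://proxy.example.edu:3128"
#   conda install -c bioconda fastqc
```

If the compute or login node has no route to the proxy either, this will not help, and the `Network is unreachable` error will persist.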

ADD REPLYlink written 6 weeks ago by genomax85k

Resolving this with the admin is the best course; my suggestion was an alternative in case that doesn't work.

Re: laboriousness, I think it depends on your use case. If you have a pipeline you've already developed and know the required packages, then containers are equally time-economical. For development a conda env is ideal, and that can be packaged inside a container once 'complete'. The benefits of the containerised version are in production/certification settings, and specifically when using Nextflow, which takes advantage of HPC IME.

I haven't got any guides on how to containerise conda envs, but I can post one if you like.

ADD REPLYlink written 6 weeks ago by bruce.moran830