Setting up environment for RNASeq on university HPCC: CondaHTTPError: HTTP 000 CONNECTION FAILED
3.9 years ago
$ conda install -c bioconda fastqc
Collecting package metadata (current_repodata.json): failed

CondaHTTPError: HTTP 000 CONNECTION FAILED for url <https://conda.anaconda.org/bioconda/linux-64/current_repodata.json>
Elapsed: -

An HTTP error occurred when trying to retrieve this URL.
HTTP errors are often intermittent, and a simple retry will get you on your way.
ConnectionError(MaxRetryError("HTTPSConnectionPool(host='conda.anaconda.org', port=443): Max retries exceeded with url: /bioconda/linux-64/current_repodata.json (Caused by NewConnectionError('<urllib3.connection.VerifiedHTTPSConnection object at 0x7fe37c3ff050>: Failed to establish a new connection: [Errno 101] Network is unreachable'))"))

I am using the university's HPC system (details below) for RNASeq. However, as you can see above, I cannot download packages because conda fails with a CondaHTTPError. How can I resolve this?

$ conda info

     active environment : None
       user config file : /home/mhnidhi2/.condarc
 populated config files : /home/mhnidhi2/.condarc
          conda version : 4.7.12
    conda-build version : 3.18.9
         python version : 3.7.4.final.0
       virtual packages : 
       base environment : /opt/ohpc/pub/anaconda3  (read only)
           channel URLs : https://conda.anaconda.org/bioconda/linux-64
                          https://conda.anaconda.org/bioconda/noarch
                          https://conda.anaconda.org/conda-forge/linux-64
                          https://conda.anaconda.org/conda-forge/noarch
                          https://repo.anaconda.com/pkgs/main/linux-64
                          https://repo.anaconda.com/pkgs/main/noarch
                          https://repo.anaconda.com/pkgs/r/linux-64
                          https://repo.anaconda.com/pkgs/r/noarch
          package cache : /opt/ohpc/pub/anaconda3/pkgs
                          /home/mhnidhi2/.conda/pkgs
       envs directories : /home/mhnidhi2/.conda/envs
                          /opt/ohpc/pub/anaconda3/envs
               platform : linux-64
             user-agent : conda/4.7.12 requests/2.22.0 CPython/3.7.4 Linux/3.10.0-957.el7.x86_64 centos/7.6.1810 glibc/2.17
                UID:GID : 1311:1001
             netrc file : None
           offline mode : False
RNA-Seq HPC Conda Connection HTTP • 3.7k views

You should talk to your cluster admin about the connection error.


Another way to resolve these conda-related connection issues (if it isn't just an intermittent issue) is to build the environment in a Singularity container on your local machine (laptop/desktop), where you have control over the connection, and then move that container to the HPC.
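
As a rough sketch of that workflow (not a tested recipe): the definition file below assumes the `continuumio/miniconda3` Docker image as a base, where conda lives under `/opt/conda`, and a Singularity 3.x install on the local machine. The file names `rnaseq.def` and `rnaseq.sif` and the login host `user@hpc.example.edu` are placeholders.

```shell
# Write a minimal Singularity definition file (run on your laptop/desktop).
cat > rnaseq.def <<'EOF'
Bootstrap: docker
From: continuumio/miniconda3

%post
    # Assumes conda is installed under /opt/conda in the base image.
    /opt/conda/bin/conda install -y -c conda-forge -c bioconda fastqc
    /opt/conda/bin/conda clean -a -y

%environment
    export PATH=/opt/conda/bin:$PATH

%runscript
    exec "$@"
EOF

# Build the image if Singularity is installed here.
# --fakeroot must be enabled for your user; otherwise use "sudo singularity build".
if command -v singularity >/dev/null 2>&1; then
    singularity build --fakeroot rnaseq.sif rnaseq.def
fi

# Then copy the image to the cluster and run the tool from it, e.g.
# (user@hpc.example.edu is a placeholder):
#   scp rnaseq.sif user@hpc.example.edu:~/
#   singularity exec rnaseq.sif fastqc --version
```

The build step needs internet access, which is why it happens locally; the resulting `.sif` file is a single image that only needs Singularity on the cluster to run.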


That is an amazing idea! I think I will try to do that. I am new to working with HPCC. Do you have a git repo on building containers like that or other HPCC functions? Thank you again!


Do yourself a favor and solve the underlying problem with your cluster admin. If you are going back and forth between a local machine and an HPC, you have to push a new container to the HPC every time you want to install a new package, which must be laborious.


Very fair point. I already emailed the admin. Hopefully he can resolve this issue. Thank you!


If your cluster does not have direct/external internet access, then many things will not work. Perhaps you need to use a proxy. Again, this is local info you would need to find out.
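
To find out whether the node has outbound access at all, and whether a proxy helps, something like the following could be run on the login node. The proxy address `http://proxy.example.edu:3128` is a made-up placeholder; the real one (if any) has to come from the cluster admin. `check_url` is a helper defined here for illustration.

```shell
# Hypothetical proxy address; replace with the one your admin gives you.
PROXY_URL="http://proxy.example.edu:3128"
CHANNEL_URL="https://conda.anaconda.org"

# Print "reachable" if an HTTP HEAD request to $1 completes within 10 s,
# optionally going through the proxy given as $2.
check_url() {
    if curl --silent --head --max-time 10 --output /dev/null ${2:+--proxy "$2"} "$1"; then
        echo "reachable"
    else
        echo "unreachable"
    fi
}

echo "direct:    $(check_url "$CHANNEL_URL")"
echo "via proxy: $(check_url "$CHANNEL_URL" "$PROXY_URL")"

# If only the proxied request works, conda can be pointed at the proxy, e.g.:
#   conda config --set proxy_servers.http  "$PROXY_URL"
#   conda config --set proxy_servers.https "$PROXY_URL"
# or, for the current shell session only:
#   export http_proxy="$PROXY_URL" https_proxy="$PROXY_URL"
```

If both checks print "unreachable", the node most likely has no route to the outside at all (consistent with `[Errno 101] Network is unreachable`), and that output is useful to include when contacting the admin.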


Resolving this with the admin is the best course; my suggestion was an alternative in case that didn't work.

Re: laboriousness, I think it depends on your use case. If you have a pipeline you've already developed and know the required packages, then containers are equally time-economical. For development a conda env is ideal, and that can be packaged inside a container once 'complete'. The benefits of the containerised version are in production/certification settings, and also specifically when using Nextflow, which takes advantage of HPC IME.

I haven't got any guides on how to containerise conda envs, but I can post one if you like.
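
One possible outline for packaging a finished conda env into a container, as a sketch under assumptions rather than a tested guide: the env name `rnaseq`, the file name `rnaseq-env.def`, and the `continuumio/miniconda3` base image (conda under `/opt/conda`) are all illustrative choices.

```shell
ENV_NAME="rnaseq"   # hypothetical name of the env you developed

# Export the env spec where the env lives (skipped if conda is not on PATH).
if command -v conda >/dev/null 2>&1; then
    conda env export -n "$ENV_NAME" --no-builds > environment.yml
fi

# Definition file that recreates the env inside the image.
# $ENV_NAME is expanded now; \$PATH is left for the container to expand.
cat > rnaseq-env.def <<EOF
Bootstrap: docker
From: continuumio/miniconda3

%files
    environment.yml /opt/environment.yml

%post
    /opt/conda/bin/conda env create -n $ENV_NAME -f /opt/environment.yml
    /opt/conda/bin/conda clean -a -y

%environment
    export PATH=/opt/conda/envs/$ENV_NAME/bin:\$PATH
EOF

# On a machine with internet access and Singularity installed:
#   sudo singularity build rnaseq-env.sif rnaseq-env.def
```

Exporting with `--no-builds` drops the build strings, which tends to make the spec easier to re-solve on a different machine, at the cost of slightly weaker reproducibility.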
