Should I Learn Docker to Run Command-Line Bioinformatics Tools?
7 months ago
arriyaz.nstu ▴ 30

I work in a genomics team where we sometimes teach people from other labs how to run bioinformatics command-line tools on Linux. Almost all of them are Windows users who have never used Linux. We usually help them set up Linux in dual boot so that they can run the tools later on their own. We also use conda extensively to install bioinformatics tools.

The problem is that each of them has a different laptop with a different configuration. The BIOS settings in particular vary widely between laptops, so setting up dual boot becomes challenging, tedious, and sometimes frustrating.

I was wondering whether Docker could make this installation process simpler and quicker. I have only a superficial understanding of Docker. I want to...

  1. Create a docker image that will contain Ubuntu, miniconda, and some bioinfo tools inside the conda.
  2. Other people will be able to build a docker container in their Windows machine and run the tools in the Ubuntu terminal.
  3. They will be able to run bash scripts inside docker in Windows.

That way, we don't have to touch their BIOS or worry about data loss and everything else. I also considered Windows Subsystem for Linux, but some old laptops don't seem to support it. VirtualBox, on the other hand, is not a good solution for low-spec laptops either.

I am looking for a solution that will work for all types of laptops (for Windows users, plus Mac if possible).

Can Docker meet all of these requirements? Is it worth learning Docker for this purpose?


Just one thing to consider.

Docker is great because it provides a way to containerize applications.

However, many people on computational clusters cannot use it, because running Docker requires root ("sudo") privileges or membership in a special "docker" group, which shared systems often do not grant.

Hence, you may also want to consider Singularity which is preferred for servers/clusters.

https://sylabs.io/


Singularity is a good option, but OP explicitly mentions that their users are on their own laptops. On laptops, it's a lot easier to use Docker than Singularity (at least on MacBooks). Plus, Singularity works almost seamlessly with Docker images, so I don't think there's a need to push OP towards Singularity.
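To illustrate the point about Singularity working with Docker images, here is a minimal sketch that pulls an image from Docker Hub into a Singularity image file and runs one command in it. The function name `pull_from_docker` and the `SINGULARITY` override variable are hypothetical helpers added only so the commands can be previewed without Singularity installed; the image name is the mambaforge image mentioned elsewhere in this thread.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Convert a Docker Hub image into a local .sif file, then run a quick check.
# $1 = output .sif file name, $2 = Docker image reference (name:tag).
# Set SINGULARITY=echo to print the commands instead of running them.
pull_from_docker() {
  local sif="$1" image="$2"
  # "docker://" tells Singularity to fetch the image from a Docker registry.
  "${SINGULARITY:-singularity}" pull "$sif" "docker://$image"
  # Run a single command inside the converted image.
  "${SINGULARITY:-singularity}" exec "$sif" conda --version
}

# Example call (requires Singularity on the machine):
#   pull_from_docker mambaforge.sif condaforge/mambaforge:latest
```

So an image built once for laptop users with Docker could, in principle, be reused on a cluster without rebuilding it.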


Fair enough

7 months ago
ATpoint 82k

Docker is great. I use it daily because it makes you independent of the host computer and lets you share and pull images via Docker Hub. It's also great for version control and reproducibility.

> Create a docker image that will contain Ubuntu, miniconda, and some bioinfo tools inside the conda.

Good idea, but don't build it from scratch. A ready-to-use base image already exists, for example: https://hub.docker.com/r/condaforge/mambaforge
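As a rough sketch of what "not from scratch" could look like, the script below writes a short Dockerfile that starts from the mambaforge image and adds a few tools. The tool list (samtools, bwa, fastqc), the directory name `bioinfo-image`, the image name `bioinfo-tools`, and the `latest` tag are all placeholders chosen for illustration; in practice you would pin a specific tag and specific tool versions.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Write a small Dockerfile that extends the mambaforge base image.
# $1 = directory that will hold the Dockerfile (created if missing).
write_dockerfile() {
  local outdir="$1"
  mkdir -p "$outdir"
  cat > "$outdir/Dockerfile" <<'EOF'
FROM condaforge/mambaforge:latest

# Install example tools from conda-forge and bioconda into the base
# environment, then clean the package cache to keep the image smaller.
RUN mamba install -y -c conda-forge -c bioconda samtools bwa fastqc \
    && mamba clean --all --yes

# Directory where a host data folder can be mounted at run time.
WORKDIR /data
CMD ["bash"]
EOF
}

write_dockerfile ./bioinfo-image

# Then, on a machine with Docker installed:
#   docker build -t bioinfo-tools ./bioinfo-image
```

Once built, the image can be pushed to a registry so that other users only need to pull it instead of building it themselves.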

> Other people will be able to build a docker container in their Windows machine and run the tools in the Ubuntu terminal.

Yes, although I would encourage setting up WSL2 on a Windows machine. Docker on Windows runs Linux containers inside a Linux virtual machine anyway, since they depend on a Linux kernel, so you might as well have WSL2 running.

> They will be able to run bash scripts inside docker in Windows.

If you spin up a container you have Linux, so yes. Again, having Linux on Windows is easiest and most powerful via WSL2, no need for dual boot or anything. With WSL2 you can still do all the office stuff on native Windows (like Word, Excel, Mail) and do the actual work on Linux.
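For running a bash script inside a container, one possible pattern is to mount the folder that holds the data and the script, and let the container execute the script there. The sketch below assumes an image called `bioinfo-tools` and a script called `pipeline.sh`; both names are hypothetical, as are the function name `run_in_container` and the `DOCKER` override variable (set `DOCKER=echo` to preview the command). It is meant to be run from a Linux-style shell such as WSL2, macOS, or Linux.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Run a bash script from a host folder inside a throwaway container.
# $1 = host directory with data and script
# $2 = script file name inside that directory
# $3 = image name (optional, defaults to the hypothetical "bioinfo-tools")
run_in_container() {
  local host_dir="$1" script="$2" image="${3:-bioinfo-tools}"
  # --rm  removes the container when the script finishes
  # -v    mounts the host directory at /data inside the container
  # -w    makes /data the working directory, so outputs land on the host
  "${DOCKER:-docker}" run --rm -v "$host_dir:/data" -w /data "$image" bash "$script"
}

# Example call (requires Docker and the image to exist):
#   run_in_container "$PWD" pipeline.sh
```

Because the data lives in the mounted host folder, results remain on the laptop after the container exits.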

It's definitely worth the time to learn WSL2 and Docker.


If the laptop doesn't support WSL2, will Docker (with Linux containers) still work?


Never tried, so idk. Why wouldn't it support it?


#TIL mambaforge - better than continuum's conda images I'm guessing?


I never tried conda images but since mamba > conda I guess mambaforge is better.


Isn't it bad practice to work within a docker container as if it's a virtual machine? I've always been taught to have one container for one task (e.g. STAR) that has a simple I/O.


Yes, it is not good practice to treat Docker containers as VMs, because containers are meant to be lean: ideally they contain only what is required to run the tool the container is built around. However, it is possible to run a container and use it as the working environment if the host filesystem is mounted into it properly. It is also possible to build a thicker image and have it serve as a quasi-VM. It's not a great way to do things, but it minimizes work for the OP.
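For contrast, the "one container for one task" style could be sketched as a thin wrapper that runs a single command from a single-tool image against the current directory. The image name in the example call (`example/star:latest`) is a made-up placeholder, not a real published image, and `run_tool` and the `DOCKER` override variable are hypothetical helpers for illustration.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Run one tool from one image with the current directory as its only I/O.
# $1 = image name, remaining arguments = the command to run in the container.
run_tool() {
  local image="$1"
  shift
  # --user keeps output files owned by the calling user on Linux hosts
  # -v/-w  expose the current directory as /data and work inside it
  "${DOCKER:-docker}" run --rm \
    --user "$(id -u):$(id -g)" \
    -v "$PWD:/data" -w /data \
    "$image" "$@"
}

# Example call (image name is a placeholder):
#   run_tool example/star:latest STAR --version
```

Each pipeline step then gets its own image and a simple input/output folder, rather than one large image that users log into.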

If users install additional libraries on the container, reproducibility takes a hit but that's not impossible to address. Ultimately, it's a simpler solution than building an HPC and is also scalable if OP decides to move to, say, AWS ParallelCluster in the future.


Why would that be an advantage?
