Question: Linux Distros Best Suited For Bioinformatics?
15
Sat3Lite (150) wrote 7.1 years ago:

Hello all,

I wanted to get a poll of what distros are currently popular with researchers working in bioinformatics and related biological data mining fields.

Mainly:

  • What Linux distribution do you currently use (or have you used previously)?
  • What software and languages (packs) do you use on a daily basis?
  • Where does this distro outperform the rest? Where does it fall short?

I'll start: I currently use Ubuntu and have for the past two years. I mainly use vim with BioPerl scripts. Ubuntu, which is Debian-based, makes installing foreign libraries almost painless. It falls short when it comes to software updates.


linux • 22k views
ADD COMMENTlink modified 20 months ago by chromosomegun260 • written 7.1 years ago by Sat3Lite150
3

Please do a search before reposting. Besides, I think the question is not which distribution is best for bioinformatics: it depends on what you are working on, and no distribution may have all the software and packages you need. It's like asking which pencil is better for taking class notes.

ADD REPLYlink written 7.1 years ago by Raygozak1.3k

See related post: http://biostar.stackexchange.com/questions/9528/which-useful-debian-like-repositories-do-you-know

ADD REPLYlink written 7.1 years ago by Casey Bergman17k

Agree with @raygozak. A lot of this is going to be about personal preference and the needs of the task at hand. The Linux world is huge, and so is the bioinformatics world.

ADD REPLYlink written 7.1 years ago by Alex Paciorkowski3.3k

I would recommend staying with Ubuntu and its recent, frequent updates and upgrades.

ADD REPLYlink written 7.1 years ago by Palani Kannan60
14
Giovanni M Dall'Olio (26k), London, UK, wrote 7.1 years ago:

BioLinux works very well. It comes with a lot of bioinformatics-related software and is based on Ubuntu. You can also use an Ubuntu system directly and use BioLinux as a repository for bioinformatics software.


ADD COMMENTlink written 7.1 years ago by Giovanni M Dall'Olio26k
1

I would also like to point out http://cloudbiolinux.org/ for those who need high computational power but don't have the power of Greyskull on their side.

ADD REPLYlink written 7.1 years ago by Daniel3.7k

This seems to be a workstation distro. Any experience with running it on a server? We're currently using Debian for our server, but we might consider switching to something else.

ADD REPLYlink written 7.1 years ago by Joris Meys130
10
Fabian Bull (1.3k), German, wrote 7.1 years ago:

Currently in use: Arch Linux

Software packages:

  • R (statistical analysis)
  • Scala (programming tasks)
  • Perl / BioPerl (simple pipelines)
  • blast / bwa (local alignments)
  • LaTeX / Inkscape (presentation and visualization)
  • Processing (visualization with Java)

Pros and cons: (decide yourself what is pro and what is con)

  • Very clean and lightweight
  • Hard to configure
  • Very configurable
  • Best documented Linux
  • You learn how Linux works
  • Great community
  • Always current versions of packages
  • Rolling releases

Conclusion:

Perfect distro if you know what you are doing.

ADD COMMENTlink written 7.1 years ago by Fabian Bull1.3k

Big plus: the AUR, where users can upload build scripts that everyone can use. There is quite a wide range of bioinformatics applications available: http://aur.archlinux.org/

ADD REPLYlink written 7.1 years ago by Michael Schubert6.9k

Will compiling from scratch be an issue here, or are there binaries?

ADD REPLYlink written 7.1 years ago by Sequer130

The AUR mentioned above requires compiling, but this is done automatically by scripts. The binary repositories are not as big as the ones from Ubuntu, but they are better maintained.

ADD REPLYlink written 7.1 years ago by Fabian Bull1.3k

Compilation and installation are fully automatic for all AUR source packages. If they are not, this is a package bug and should be reported to the maintainer.

ADD REPLYlink written 7.1 years ago by Michael Schubert6.9k
8
Pascal (1.4k), Barcelona, wrote 7.1 years ago:

Ubuntu. I agree with you that it is very comfortable when installing new software (most of it is Debian-package friendly). IMHO this is the best choice for the time being, as it keeps you close to the mainstream.

FYI, Linux Mint is getting very popular, but this is not related to Bio.

BTW, do you know that people have written new source code editors since Vim? ;-)

ADD COMMENTlink written 7.1 years ago by Pascal1.4k
4

you mean emacs?

ADD REPLYlink written 7.1 years ago by Michael Dondrup45k

I'm a big fan of Ubuntu!

ADD REPLYlink written 6.9 years ago by Geparada1.4k
6
ALchEmiXt (1.9k), The Netherlands, wrote 7.1 years ago:

Ubuntu, and just add support for the BioLinux packages... it works quite well for us. This allows us to keep our own custom server and just pull the preconfigured tools from BioLinux when we need them.

ADD COMMENTlink written 7.1 years ago by ALchEmiXt1.9k
1

same experience for me

ADD REPLYlink written 7.1 years ago by Manu Prestat3.9k

It is also available as a bootable Live CD and can be configured to run inside a VM :-)

ADD REPLYlink written 7.1 years ago by ALchEmiXt1.9k
5
Malachi Griffith (17k), Washington University School of Medicine, St. Louis, USA, wrote 7.1 years ago:

This article, "Linux distributions for bioinformatics: an update", describes several options, including DNALinux, a virtual machine appliance for those who want to run Linux within their existing OS.

ADD COMMENTlink written 7.1 years ago by Malachi Griffith17k
1

I don't think that article is up to date anymore; it was written in 2009.

ADD REPLYlink written 7.1 years ago by Michael Schubert6.9k
5
Manu Prestat (3.9k), Marseille, France, wrote 7.1 years ago:

As I just commented, Ubuntu (LTS in my case) plus the BioLinux packages is the solution I use, and I'm sure many others are good as well.

I think it would also be interesting to know which distributions are NOT good for bioinformaticians. Which distros should we avoid? One example that comes to mind is CentOS, which a computer network guy (not bio at all) recommended to me for its stability. The problem for me was that the bioinformatics software ecosystem is very active, and unfortunately it was very annoying to install or compile even simple things that needed Python, Ruby, the latest gcc, etc. So I gave up on CentOS and went back to Ubuntu, which, at least for the LTS releases, has had no more stability problems in my case...

ADD COMMENTlink modified 6.9 years ago • written 7.1 years ago by Manu Prestat3.9k
4
Guy (50) wrote 7.1 years ago:

I believe any popular distro should be good enough. I say popular because most packages and software for Linux will be tested thoroughly only on the distros that are most used.

If it helps: all the PhD students in the lab where I work use Ubuntu.

EDIT: For any serious work I would suggest using an LTS version. The latest one is Lucid.

ADD COMMENTlink modified 7.1 years ago • written 7.1 years ago by Guy50
3
Steffen Moeller (30) wrote 6.9 years ago:

Deep in my heart I wish to say that the choice of your distribution is a complete non-issue. The tools you need should be available as regular packages of your distribution today. If not all, then most of them. And the community should embrace you when you come with your desire for an additional tool to be packaged or when you already come with those packages that you wish to bring back to the (re)distribution for the benefit of all. I tend to think that most if not all community-run distributions now work like that and Bioinformatics packages seem omnipresent for basic analyses in all distros. So just make your pick.

My personal choice was Debian, since the community started the distribution and still controls all of it. Anyone submitting a package to Debian is happy about every user of one of the many "derivative distros", like Ubuntu or Mint ... or BioLinux. And after all, it is the community that I am after, not only the tools. A good part of us want to talk biology, or computational biology for that matter, and which Debian derivative one uses is not of any concern - we can easily run everything everywhere, and all bioinformatics package names are identical between the .deb distributions. The latter point is the most important to me: how quickly others can join in practically with ideas, and how portable parts of their workflows are to my machine. The .deb world is big and open and collaborative.

So, with Debian we have quite an accepted server platform, and BioLinux uses what Debian has and adds what it wants on top. Both share a repository with the DebianMed initiative (a somewhat better name IMHO may be the DebianUbuntuBioLinuxMint initiative), where packaging efforts are communicated and shared between them. There are annual cross-distribution .deb sprints on bioinformatics, and people get to know each other from conferences over the years. There is no separation of DebianMed from the rest of Debian. When, e.g., a Java package is missing for a bioinformatics package, it is added to the benefit of all Debian/Ubuntu/Mint/BioLinux users. And that transition is automated for the respective latest releases.

Particularly attractive is now an extension of BioLinux to CloudBioLinux, an Amazon Image auto-mounting relevant public datasets and with a prepared interface for Galaxy, a web-based workflow engine. Neat. There is also a Debian variant of CloudBioLinux. Also neat. Why? I am not completely sure. Most likely for two reasons: a) it is technically simple and b) there is a particular scientific and technical beauty in knowing everything on an image to be buildable, i.e. inspectable, from source code.

The enemy is not the distribution. It is time. Software changes rapidly. And while the LTS (long term support) release of Ubuntu is great, it becomes increasingly difficult to maintain current bioinformatics software on top of an older Python and older versions of everything else. And when someone packages for Debian (where packaging is always performed against the very latest version), sharing that effort with other distros or with the same distro's earlier releases is not always straightforward. One needs to backport. From my observation, this backporting is done best by BioLinux.

When equipped with root privileges, there is the concept of "chroot" environments, which let you install any distribution inside any other. For instance, I ran Debian inside SuSE for many years because of all the bioinformatics software it ships, including a second X interface that I reached with SHIFT-ALT-F8. So, have whatever you need in some production environment chrooted, or just do not touch it once it is up. You can have multiple such chroots in parallel. Also learn about the "hold" option of the dpkg tools (the package manager of .deb distros), which avoids accidental updates of packages when a new version appears.
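
As a rough sketch of the "hold" idea: on current Debian and Ubuntu systems the apt-mark front end sets the same dpkg selection state, so the example below uses it instead of calling dpkg directly. The package name bwa is only an illustration, and the run helper and DRY_RUN variable are hypothetical conveniences, not part of any tool. With the default DRY_RUN=1 the commands are printed rather than executed, since holding packages needs root.

```shell
#!/bin/sh
# Sketch: pin a package at its installed version on a .deb system.
# "bwa" is only an example package name; DRY_RUN=1 (the default here)
# prints the commands instead of running them, since they need root.
set -e

PKG="${1:-bwa}"
DRY_RUN="${DRY_RUN:-1}"

# Hypothetical helper: echo the command in dry-run mode, else execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Mark the package as held so upgrades skip it.
run sudo apt-mark hold "$PKG"

# List everything currently on hold.
run apt-mark showhold

# Release the hold once you are ready to upgrade again.
run sudo apt-mark unhold "$PKG"
```

Setting DRY_RUN=0 runs the commands for real. The plain dpkg equivalent of the first step is `echo "bwa hold" | dpkg --set-selections`, run as root.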

My personal suggestion is to take any Debian derivative. It does not matter which one it is. You can easily switch between them, e.g. I incrementally "downgraded" my desktop from Ubuntu to Debian by adding the Debian package repositories and removing the Ubuntu ones. Not that I would suggest that to everyone, but rest assured that compatibility is higher than one might think. For an overview, have a look at this list of "bio" packages in Debian/Ubuntu.

Good luck,

Steffen

ADD COMMENTlink modified 6.9 years ago • written 6.9 years ago by Steffen Moeller30

My personal suggestion would be to try the Debian suite that suits you best. Want stability (e.g. in a cluster computing environment)? Try stable. Want fresh versions? Try testing or even unstable to get all the new goodness DebianMed brings with it.

chroot solutions/workarounds are indeed under-utilized. I quite often rely on the debootstrap + schroot tandem for various reasons, e.g. to run software not supported by a specific Debian suite, as in http://neuro.debian.net/blog/2011/2011-12-12_schroot_fslview.html

ADD REPLYlink written 6.9 years ago by yarikoptic0
2
sklages (60), Berlin, DE, wrote 7.1 years ago:

I compile most bioinformatics packages on my own. I don't use any repos for bioinformatics software packages. I had a lot of problems with Ubuntu-based distributions; they seem to do a lot "their own way". Compiling packages like Staden is quite a hassle under Ubuntu.

In my hands, any Red Hat-based distribution works fine for compiling from source: CentOS is essentially RHEL and thus a bit conservative, but it works fine. Privately and for testing I use bleeding-edge Fedora (at work, an in-house developed Linux distribution).

My 2p :-) Sven

ADD COMMENTlink written 7.1 years ago by sklages60
2
User 9501 (30) wrote 7.1 years ago:

On any given day I (or others) might git clone and run or write code on CentOS, Mint Debian, OS X and RHEL. The only sane solution is to export PATH=/home/myname/mydir:$PATH on all my boxes and my cluster node, then install the same dependencies across workstations. It works for me: I can control exactly which version of each library I use, and updating them isn't that hard.
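
A minimal sketch of this PATH approach, assuming tools are installed into a private directory that is put ahead of the system directories. The temporary directory, the bin/ subdirectory and the mytool script are hypothetical stand-ins; on a real machine the prefix would be something like /home/myname/mydir and the export line would go into your shell startup file.

```shell
#!/bin/sh
# Sketch: a per-user prefix that takes precedence over system-wide tools.
# PREFIX, the bin/ layout and "mytool" are hypothetical illustration names.
set -e

PREFIX="$(mktemp -d)"   # stands in for /home/myname/mydir
mkdir -p "$PREFIX/bin"

# A stand-in for a tool built from source and installed into the prefix.
cat > "$PREFIX/bin/mytool" <<'EOF'
#!/bin/sh
echo "mytool 1.2.3 (from user prefix)"
EOF
chmod +x "$PREFIX/bin/mytool"

# Prepend the prefix so its binaries win over any system copy.
export PATH="$PREFIX/bin:$PATH"

# Show which copy the shell will actually run, then run it.
resolved="$(command -v mytool)"
echo "mytool resolves to: $resolved"
mytool
```

Because the prefix comes first in PATH, the same pinned version is picked up on every machine that shares this setup, regardless of what the distro itself ships.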

Any Linux distribution will do; I like Gentoo or Mint Debian.

Gentoo has an active bioinformatics community. They also have very up-to-date software. Everything is compiled to your liking and for your hardware. When it was properly configured, I was surprised how much faster my software was with aggressive flags and modern instructions on 4-core CPUs. It is a significant speedup compared to the "run anywhere" packages in most distros' repos. Numerical libs like ATLAS and BLAS see speedups of 10x or much greater.

Mint Debian/Debian/Ubuntu has a large community, so by extension they have a large bioinformatics community. There are a boatload of external repos that are great. Neurodebian is pretty good. They are all easy to install, not a timesink to update, very stable. In other words they don't get in my way.

ADD COMMENTlink written 7.1 years ago by User 950130
2
Andreas Tille (20) wrote 6.9 years ago:

You might like to take a look at the packages bundled into official Debian by the Debian Med team. An overview is available at http://debian-med.alioth.debian.org/tasks/bio. These packages are maintained by a constantly growing team of bioinformaticians who are members of the Debian Med team. All packages are available via the official Debian mirrors and, as a consequence, are available in Ubuntu as well. The team works closely with the BioLinux team, whose distribution is based on Ubuntu LTS. This means that packages which are currently only available in BioLinux will be moved directly into Debian and will finally also end up in official Ubuntu.

ADD COMMENTlink written 6.9 years ago by Andreas Tille20
1
Maxime Lamontagne (2.1k), Québec, wrote 7.1 years ago:

Personally, I don't like the new interface of Ubuntu (Unity). I prefer something simpler and more similar to Windows, like the Gnome interface, because it's easier for a new student to work with something familiar. Linux Mint is also nearly 100% compatible with Ubuntu software.

ADD COMMENTlink written 7.1 years ago by Maxime Lamontagne2.1k
0
5heikki (8.1k), Finland, wrote 3.2 years ago:

I'm getting a new workstation. Specs include 2 x Xeon E5-2630 v3 (altogether 32 threads), 128GB RAM, 512GB SSD and 2 x 2TB HDD for storage. I'm thinking I'll set up either the latest Ubuntu LTS or CentOS on it. EOL for the latest Ubuntu LTS is April 2019. EOL for CentOS 7 is June 2024. Both of these are fine. Any strong opinions towards one of these distros or some alternative? I'm not interested in distros with short life cycles. Debian Stable is an option, although I'm a little bit worried about the relatively old versions of many of its packages (which may bite in the form of dependency problems). I prefer Gnome over KDE, but in the end the desktop environment doesn't matter that much since my screens will mostly be filled with terminal windows anyway. I'll build most stuff from source and don't see much difference between apt and yum.

ADD COMMENTlink modified 3.2 years ago • written 3.2 years ago by 5heikki8.1k

I have never tried it myself, but check out Bio-Linux (based on Ubuntu LTS).

They added some bioinformatics software packages, but I am not sure how up-to-date the tools are. Have a look at the software list on their webpage.

ADD REPLYlink modified 3.2 years ago • written 3.2 years ago by Manuel Landesfeind1.2k