Question

Baseline Configuration Of A Bioinformatics Server

3


15.2 years ago

Biomed 5.0k

Hi, I am in the process of defining a baseline server for bioinformatics analysis of next-gen data. I am not going to be the administrator of this machine, so here is the list I came up with for the initial configuration. This machine is going to be used for post-alignment and variant calling steps, but it would also be nice to have the capacity for alignment itself. What other software packages, tools or modifications do you think are necessary for a good bioinformatics server?

- Linux OS
- Apache server
- PHP
- MySQL
- PostgreSQL (optional, but I suspect we may need it in the near future)
- Perl
- Python 2.6.x and 3.1
- R
- BioPerl, Biopython and Bioconductor for R (can be installed by us or for us)
- MediaWiki installation (either we do this install ourselves, or we need full access to its administration)
- Access to the Apache configuration
- Admin rights for the MySQL database
- Ports: SSH, and 80 for the intranet


updated 11.3 years ago by Biostar 20 • written 15.2 years ago by Biomed 5.0k

1


Then read this article: http://www.nature.com/news/2010/100428/full/4641260a.html ("Cybersecurity: how safe is your data?" from Nature News)

15.2 years ago by Giovanni M Dall'Olio 28k

0


Do you mean a web server, or a service server, i.e. a place where users can log in to do their analysis?

15.2 years ago by Giovanni M Dall'Olio 28k

0


Both, but more a web server, since we already have service servers that we can access to do our own analysis with scripts etc. But we want to make the analysis results available to the end user mainly as data table, search and filter etc. functions. We also want to be able to run some analyses on this machine as well.

15.2 years ago by Biomed 5.0k

Ram · Answer 1 · 2010-05-13

After reading your question and comments, in particular:

"more a web server ... we want to make the analysis results available to the end user mainly as data table, search and filter etc. functions."

I believe that you will find the Galaxy server interesting.

You can simply put your data on the server as downloadable files over FTP. Another way is to set up a BioMart instance on your server to allow filtering and extraction of specific parts of your data.
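To make the "filter" and "extraction" idea concrete, here is a minimal sketch of a search over a results table. It is not BioMart or Galaxy code. It assumes a hypothetical `variants` table with made-up demo rows, and it uses Python's built-in `sqlite3` module as a stand-in for the MySQL or PostgreSQL database mentioned in the question.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with a hypothetical variants table.

    The table layout and the rows are illustrative assumptions only.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE variants "
        "(chrom TEXT, pos INTEGER, ref TEXT, alt TEXT, gene TEXT)"
    )
    conn.executemany(
        "INSERT INTO variants VALUES (?, ?, ?, ?, ?)",
        [
            ("chr1", 12345, "A", "G", "GENE_A"),
            ("chr1", 67890, "C", "T", "GENE_B"),
            ("chr2", 13579, "G", "A", "GENE_A"),
        ],
    )
    return conn


def filter_variants(conn, chrom=None, gene=None):
    """Return rows matching the given filters; None means 'do not filter'."""
    clauses, params = [], []
    if chrom is not None:
        clauses.append("chrom = ?")
        params.append(chrom)
    if gene is not None:
        clauses.append("gene = ?")
        params.append(gene)
    sql = "SELECT chrom, pos, ref, alt, gene FROM variants"
    if clauses:
        sql += " WHERE " + " AND ".join(clauses)
    sql += " ORDER BY chrom, pos"
    # Values are passed as bound parameters, never pasted into the SQL string.
    return conn.execute(sql, params).fetchall()


if __name__ == "__main__":
    demo = build_demo_db()
    for row in filter_variants(demo, gene="GENE_A"):
        print(row)
```

A web page would call something like `filter_variants` with the values from a search form. Binding the values as query parameters matters on a server that end users can reach, since it keeps form input out of the SQL text.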

Depending on the data types, you may also want to install web-accessible visualization programs to extend the functionality. Check a previous post here.

score 4 · Answer 2 · 2010-05-12


15.2 years ago

Istvan Albert 102k

What I would recommend is that you acquire a system with a lot of RAM and a lot of processors. Also make sure you get redundant storage and some type of tape backup. Installing some of the software that you mention is very easy with a package manager.

Installing scientific libraries such as LAPACK, Boost and FFT can be more difficult and may need system administration skills, as those can have conflicting requirements.
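Since someone else will administer the machine, a small check like the one below can confirm that the requested software actually ended up on the `PATH`. This is only a sketch: the executable names in `BASELINE_TOOLS` are assumptions derived from the list in the question, and they vary between distributions.

```python
import shutil

# Assumed executable names for the packages listed in the question;
# adjust them to whatever the distribution actually installs.
BASELINE_TOOLS = ["perl", "python3", "Rscript", "php", "mysql", "psql"]


def missing_tools(tools, which=shutil.which):
    """Return the tools from `tools` that cannot be found on PATH.

    `which` is injectable so the function can be tested without
    depending on what is installed on the current machine.
    """
    return [tool for tool in tools if which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools(BASELINE_TOOLS)
    if missing:
        print("Not found on PATH:", ", ".join(missing))
    else:
        print("All baseline tools found.")
```

This only checks for executables. Libraries such as LAPACK or Boost, and language modules such as BioPerl or Biopython, need their own checks.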


0


Hi, thanks for the answer. The machine has a lot of processing power, RAM and storage. I also have access to cluster resources, so I am not hugely worried about that part; I am mainly interested in the software part.

15.2 years ago by Biomed 5.0k

score 4 · Answer 3 · 2010-05-13


15.2 years ago

Neilfws 49k

EMBOSS is an excellent bioinformatics suite for any server and has several options for providing a web interface.

Access to R functions can be provided using RApache.


score 1 · Answer 4 · 2010-05-17

If you are planning to use the server to provide genome data via a browser, you should get GBrowse (tutorials/HOW-TOs available here: http://gmod.org/wiki/Next_Generation_Sequencing) or a similar genome browser on the server. SAMtools and related tools will also be useful. Other options are the BLAST executables, HMMER3 etc., depending on the level of data access you are planning to provide for the users.