Question: Establishing a Core Bioinformatics Facility
2.7 years ago by
Sara 20 wrote:

Hi, we are establishing a bioinformatics core at our institution. The idea is to start with 1500 to 2000 whole exomes a year, with plans to add other services in the future (gene regulation, miRNA regulation, genome variation, etc.). The budget is not an issue. My question is where to start with respect to equipment (hardware) and staff. Thank you - Sara

sequencing core next-gen genome • 1.8k views
modified 3 months ago by Jeremy Leipzig 18k • written 2.7 years ago by Sara 20

budget is not an issue

That does not happen in the real world :) Perhaps you were told that so you would sign on.

written 2.7 years ago by genomax 69k

Also, unlimited budget is somewhat at odds with the question "where to start"...

written 2.7 years ago by dariober 10k

You may be interested in Establishing a Successful Bioinformatics Core Facility Team or Good luck!

written 2.7 years ago by Madelaine Gogol 5.1k

See also: Creating A New Bioinformatics Unit

written 2.7 years ago by Israel Barrantes 740

Thank you guys, I appreciate your help. -Sara

written 2.7 years ago by Sara 20
2.7 years ago by
Freiburg, Germany
Devon Ryan 91k wrote:

I think the hardware side of things is easy enough (find out the vendor your IT folks have been working with and spec out a small cluster with them, make sure that you include a backup method of some sort). Regarding personnel, you would at the very least need one staff-scientist level person to oversee things like "are the sequence runs OK?", "let's build a pipeline", "are the results of the pipeline reasonable given the biological questions being posed?" (I'm assuming you'll be doing the full analysis rather than kicking BAM files down to the wet lab folks).

The most important part of all of this is something you didn't mention and that's what the expectations are of you/the core. This needs to be spelled out very early on and very clearly. What typically happens is that a core is set up with goal X in mind and 6 months later you're working on X plus A->Q. This is, to me at least, the most important thing to clear up with all of the stakeholders before you start putting things out for bid or placing job ads (btw, you can do that here).

written 2.7 years ago by Devon Ryan 91k

"kicking BAM files down to the wet lab folks" hahaha xD

written 2.7 years ago by Sukhdeep Singh 9.8k
2.7 years ago by
Spain. Universidad de Córdoba
Antonio R. Franco 4.1k wrote:

My two cents, since I lived through this experience at my university.

If you invest in a huge computing facility, you will certainly spend a lot of money (and I mean a lot), and you can expect the computers to become obsolete after a few years. Not to mention the effort of maintaining that service.

On the other hand:

Huge changes in the NGS world are coming in the short or medium term. We will be using a new generation of sequencers and/or utilities that will require fewer resources. One example is the use of long-read sequencers (PacBio, Nanopore and the like), or programs like Kallisto that run a pseudoalignment in minutes using only 1 or 2 GB of RAM.

You also need to consider hiring someone to maintain the systems.

In our case, we weighed all of these things and decided not to spend such a huge amount of money. We are using computing facilities like Amazon EC2 (you pay for what you use) or supercomputers around us. Amazon maintains its own computers, so that is labor you avoid.

written 2.7 years ago by Antonio R. Franco 4.1k

Hi, thanks for sharing this. I'm curious about Amazon EC2.

I have no experience with it, but from what I have heard, once you are logged in (via ssh, I guess?) it looks like any other server or cluster running some flavour of Linux, right? If so, does it use a scheduler to process your jobs, like LSF or Slurm?

Also, when you transfer largish files (fastq, bam etc) is the speed of transfer an issue?

written 2.7 years ago by dariober 10k

Yes, it does behave like a regular server (Amazon EC2). Look at Google Compute Engine and Microsoft Azure as well; you may be able to get better prices there. The current limitation is the maximum amount of RAM a single server can have. Last I looked, it was 256GB.

If you are at an institution that has a good network connection to your internet provider (and if the cloud provider also has a good peering connection), then you can basically get wire speed for data transfers (you will pay for that though).
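
To get a feel for what "wire speed" means in practice, here is a minimal sketch of the transfer-time arithmetic in Python. The 10 GB file size, the 1 Gbit/s link and the efficiency factor are illustrative assumptions on my part, not measurements from any provider, and the function name is made up for this example.

```python
def transfer_seconds(size_gb, link_gbit_per_s, efficiency=1.0):
    """Seconds needed to move size_gb gigabytes over a link.

    size_gb         -- file size in decimal gigabytes (1 GB = 8e9 bits)
    link_gbit_per_s -- nominal link speed in Gbit/s
    efficiency      -- assumed fraction of nominal speed actually achieved
    """
    size_bits = size_gb * 8e9
    usable_bits_per_s = link_gbit_per_s * 1e9 * efficiency
    return size_bits / usable_bits_per_s


if __name__ == "__main__":
    # Hypothetical 10 GB fastq/bam file over a 1 Gbit/s link.
    for eff in (1.0, 0.5):
        secs = transfer_seconds(10, 1, efficiency=eff)
        print(f"efficiency {eff:.0%}: {secs:.0f} s")
```

At full wire speed a file of that (assumed) size moves in well under two minutes, so for single files the link is rarely the problem; it is the total volume per year, and what the provider charges for it, that adds up.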

written 2.7 years ago by genomax 69k
2.7 years ago by
Sheffield, UK
i.sudbery 5.0k wrote:

I think a small cluster would start at 10 x 16-core compute nodes, plus a head node, each with 256GB of RAM. Importantly, don't skimp on the storage, particularly if most of the work you are planning to do is exomes, which take up a lot of space. Make sure you get something that doesn't get slower as it gets busier, so something like Isilon. Connect it all together with at least Gigabit Ethernet. Cloud is definitely a possibility, but watch out for the data transfer costs, which more than doubled the quote last time I costed a grant on the cloud.
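
As a rough sanity check on those numbers, the arithmetic can be sketched in Python. The node count, cores per node and RAM come from the suggestion above, and the 1500 to 2000 exomes a year from the original question. The utilisation parameter and the function names are my own assumptions, and the head node is not counted as compute.

```python
# Cluster size as suggested above; the head node is excluded from compute.
NODES = 10
CORES_PER_NODE = 16
RAM_PER_NODE_GB = 256
HOURS_PER_YEAR = 365 * 24


def core_hours_per_sample(samples_per_year, utilisation=1.0):
    """Core-hours available per sample over one year.

    utilisation is an assumed fraction of the cluster that is actually
    free for this workload (1.0 means every core, all year round).
    """
    total_cores = NODES * CORES_PER_NODE
    return total_cores * HOURS_PER_YEAR * utilisation / samples_per_year


def ram_per_core_gb():
    """RAM per core if every core on a node is in use."""
    return RAM_PER_NODE_GB / CORES_PER_NODE


if __name__ == "__main__":
    print(f"RAM per core: {ram_per_core_gb():.0f} GB")
    for n in (1500, 2000):
        budget = core_hours_per_sample(n, utilisation=0.5)
        print(f"{n} exomes/year at 50% utilisation: {budget:.1f} core-hours each")
```

Plug in the core-hours your own pipeline actually uses per exome to see how much headroom is left for the other services.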

To run all that you'll need a sysadmin. In addition, employ at least one high-grade, properly experienced bioinformatician (the minimum grade is at least the career grade for a research division leader). Which other employees you need depends on what you want from the core. If you just want "you provide sequence, I provide lists of SNPs", then you'll probably be okay with master's-level people. However, my experience is that most folks want help interpreting the data as much as analysing it. In this case I'd argue for hiring a bunch of postdoc-level people and basically hawking them out as rent-a-postdocs, spending 30 or 50% of their time on a project for a collaborator over a period of six months to a year for each project. Either pay them well and offer job security, or offer them a slice of their time to work on projects of their own choosing: you will need to do something to stop the ones that are any good from leaving for jobs with more freedom at the first opportunity.

written 2.7 years ago by i.sudbery 5.0k
2.7 years ago by
United States
genomax 69k wrote:

Either you are creating a sequencing core lab (that may do the needed bioinformatics on the side) or you should consider creating two separate cores. In the latter case, one just does sequencing and the other bioinformatics. That way both would be free to follow other opportunities, since 2000 exomes a year will not keep either core completely busy.

Edit: Re-reading your original post, it sounds like you are only setting up a bioinformatics core (i.e. sequencing may be done elsewhere), so the above may not apply. I will leave the answer here in case both aspects apply.

modified 2.7 years ago • written 2.7 years ago by genomax 69k
2.7 years ago by
chen 1.9k wrote:

Since budget is not an issue, obviously you need a set of Illumina HiSeq X Ten, and then build a data center

modified 2.7 years ago • written 2.7 years ago by chen 1.9k

The HiSeq X Ten System is the most powerful sequencing platform ever created. The system consists of a set of 10 HiSeq X ultra-high-throughput instruments that deliver over 18,000 human genomes per year at the price of $1000 per genome. The HiSeq X Ten makes human whole-genome sequencing more affordable and accessible than ever before.

Sounds like overkill to me, if you are (only) going to run 2000 exomes per year. Also, I assume the "hardware" is more related to servers (and not to sequencers).

modified 2.7 years ago • written 2.7 years ago by WouterDeCoster 40k

The operative phrase is "budget is not an issue" :)

written 2.7 years ago by Devon Ryan 91k

I thought it was 'We are establishing a core bioinformatics facility' ;) I haven't yet seen the OP mention they needed HiSeqs of any flavour :)

written 2.7 years ago by Daniel Swan 13k

The more I repeat that in my head the better it sounds. Start ordering PromethIONs and a HiSeq X Ten system then! Give me just a bit of time to finish my PhD and hire me! But perhaps Sara will have some more information for us soon.

written 2.7 years ago by WouterDeCoster 40k