Question: Automatic Analysis Pipeline For Raw Sequenced Data
6.5 years ago, Stevelor wrote:

Hey all,

We are trying to establish a fully automated standard analysis pipeline for our sequenced samples from a HiSeq 2000 machine. I would like to hear about your experiences: how far can we go, which steps are unavoidable, and which cannot be automated? Have you also worked on a "genome content management system" like this, and what was your experience with it?

What we have already implemented: for every sequencing run, different SQL tables hold the information on each sample, for example the sample sheet CASAVA needs, the sample characteristics, the kind of sequencing (strand-specific or not), the path where the sample data can be found, the insert size (PE), whether it is a metatranscriptome or metagenome, and a lot more. We use this information to drive the first step: CASAVA gets started, afterwards the samples are moved to the corresponding project folder, and FastQC is executed as quality control.
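To make the "SQL tables to sample sheet" part concrete, here is a minimal sketch in Python using an in-memory SQLite database. The table name, the column names and the sample sheet layout are all hypothetical placeholders, not the layout CASAVA actually expects; the real column order should be taken from the CASAVA documentation.

```python
import csv
import io
import sqlite3

# Hypothetical schema: real tables and columns will differ per lab.
SCHEMA = """
CREATE TABLE samples (
    run_id TEXT, lane INTEGER, sample_id TEXT, index_seq TEXT,
    project TEXT, strand_specific INTEGER, insert_size INTEGER
)
"""

# Illustrative header only, NOT the official sample sheet layout.
COLUMNS = ["lane", "sample_id", "index_seq", "project"]


def build_samplesheet(conn, run_id):
    """Return CSV text with one row per sample of the given run."""
    rows = conn.execute(
        "SELECT lane, sample_id, index_seq, project FROM samples "
        "WHERE run_id = ? ORDER BY lane, sample_id",
        (run_id,),
    ).fetchall()
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(COLUMNS)
    writer.writerows(rows)
    return out.getvalue()


def demo():
    """Fill a throwaway database with made-up samples and build one sheet."""
    conn = sqlite3.connect(":memory:")
    conn.execute(SCHEMA)
    conn.executemany(
        "INSERT INTO samples VALUES (?, ?, ?, ?, ?, ?, ?)",
        [
            ("run42", 2, "S2", "CGATGT", "projB", 0, 300),
            ("run42", 1, "S1", "ATCACG", "projA", 1, 250),
            ("run43", 1, "S9", "TTAGGC", "projA", 1, 250),
        ],
    )
    return build_samplesheet(conn, "run42")


if __name__ == "__main__":
    print(demo())
```

Keeping the sheet generation as a pure function of the database content means the same run can be regenerated at any time, which helps with traceability.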

The next steps would be mapping and quantification. All of these steps are traceable in a CMS: every major event creates an automatic post in this CMS.
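Roughly, the "every step leaves a trace" idea could be sketched like this. It is only an illustration: the `Pipeline` and `Event` classes and the step names are made up, the steps are dummy functions rather than real tool calls, and the events are kept in a plain list where a real setup would post them to the CMS.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass
class Event:
    sample: str
    step: str
    status: str  # "ok" or "failed"
    detail: str = ""


@dataclass
class Pipeline:
    steps: List[Tuple[str, Callable[[str], None]]] = field(default_factory=list)
    events: List[Event] = field(default_factory=list)  # stand-in for CMS posts

    def add_step(self, name, func):
        self.steps.append((name, func))

    def run(self, sample):
        """Run all steps in order; stop at the first failure and record it."""
        for name, func in self.steps:
            try:
                func(sample)
            except Exception as exc:
                self.events.append(Event(sample, name, "failed", str(exc)))
                return False
            self.events.append(Event(sample, name, "ok"))
        return True


def demo():
    def qc(sample):
        # Dummy check standing in for a real quality control step.
        if sample == "bad_sample":
            raise ValueError("quality below threshold")

    pipeline = Pipeline()
    pipeline.add_step("demultiplex", lambda sample: None)
    pipeline.add_step("qc", qc)
    pipeline.add_step("mapping", lambda sample: None)
    results = {s: pipeline.run(s) for s in ["good_sample", "bad_sample"]}
    return results, pipeline.events


if __name__ == "__main__":
    results, events = demo()
    for event in events:
        print(event)
```

Stopping a sample at the first failed step, instead of aborting the whole run, is one way to cope with the fact that every sample is a bit different: the failures end up in the log for manual follow-up while the rest continues.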

It would be nice to get some feedback on this. Is it possible AND useful to create such a pipeline? After all, every sample is a bit different. What do you think?



data next-gen pipeline sequencing • 1.5k views

You're describing a LIMS system. These are not trivial to develop, and not cheap to buy!

written 6.5 years ago by Daniel Swan

I think the first step would be to see what's out there (BioTeam MiniLIMS, Galaxy, Taverna, stuff built with Ruffus or Paver) and report your findings in a blog post or on the SEQanswers wiki.

written 6.5 years ago by Jeremy Leipzig
6.5 years ago, Roman Valls Guimerà wrote:

Hello SteveLor,

To some extent your question reminds me of a previous post on Biostar; you might want to have a look at it:

In our lab we're using and extending Brad's pipeline:

There are a few issues that need to be addressed, but overall it does the job for us. The Galaxy side, which would greatly help with sample management, is still in the works due to IT security concerns.

Hope that helps!
