Question: Automatic Analysis Pipeline For Raw Sequenced Data
6.6 years ago, Stevelor wrote:

Hey all,

We are trying to establish a fully automated standard analysis pipeline for samples sequenced on our HiSeq 2000. I would like to hear about your experiences: how far can automation go, which steps are unavoidable, and which cannot be automated? Have you also worked on a "genome content management system" like this?

What we have implemented so far: for every sequencing run, several SQL tables hold the information for each sample, for example the sample sheet that CASAVA needs, the sample characteristics, the kind of sequencing (strand-specific or not), the path to the sample data, the insert size (for paired-end runs), whether the sample is a metatranscriptome or metagenome, and a lot more. We use this information to build a workflow: as a first step CASAVA is started, then the samples are moved to the corresponding project folder, and FastQC is run for quality control.

The next steps would be mapping and quantification. All of these steps are traceable in a CMS: every major event creates an automatic post there.
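The workflow described so far could be sketched roughly as follows. This is an illustration only, not the poster's actual code: the `samples` table, its columns, and the helpers `plan_steps` and `run_pipeline` are all assumed names, and the steps are only logged rather than executed, so no real CASAVA or FastQC command lines are implied.

```python
import sqlite3
from pathlib import Path

# Hypothetical schema: one row per sample of a sequencing run.
SCHEMA = """
CREATE TABLE samples (
    sample_id       TEXT PRIMARY KEY,
    project         TEXT NOT NULL,
    run_id          TEXT NOT NULL,
    paired_end      INTEGER NOT NULL,  -- 1 = paired-end, 0 = single-end
    strand_specific INTEGER NOT NULL,  -- 1 = strand-specific library
    insert_size     INTEGER            -- only meaningful for paired-end
)
"""


def plan_steps(sample, project_root):
    """Return the ordered (step, detail) pairs for one sample record."""
    project_dir = Path(project_root) / sample["project"] / sample["sample_id"]
    layout = "paired-end" if sample["paired_end"] else "single-end"
    strand = "strand-specific" if sample["strand_specific"] else "unstranded"
    return [
        ("demultiplex", f"run {sample['run_id']}"),  # the CASAVA step
        ("move", str(project_dir)),                  # into the project folder
        ("fastqc", str(project_dir)),                # quality control
        ("mapping", layout),
        ("quantification", strand),
    ]


def run_pipeline(conn, run_id, project_root, log_event):
    """Walk all samples of a run and report every step via log_event.

    log_event stands in for "create a post in the CMS"; a real pipeline
    would launch the corresponding tool before logging each step.
    """
    conn.row_factory = sqlite3.Row
    rows = conn.execute(
        "SELECT * FROM samples WHERE run_id = ? ORDER BY sample_id",
        (run_id,),
    ).fetchall()
    for row in rows:
        for step, detail in plan_steps(row, project_root):
            log_event(f"{row['sample_id']}: {step} ({detail})")
```

To try it, create an in-memory database with `sqlite3.connect(":memory:")`, execute `SCHEMA`, insert a sample row, and call `run_pipeline(conn, "run42", "/data/projects", print)`. Keeping the per-sample decisions in `plan_steps` means the "every sample is a bit different" cases live in one place that is driven by the metadata.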

It would be nice to get some feedback on this. Is it both possible and useful to build such a pipeline? After all, every sample is a bit different. What do you think?

Thanks!

Steve

data next-gen pipeline sequencing • 1.6k views

You're describing a LIMS system. These are not trivial to develop, and not cheap to buy!

Comment written 6.6 years ago by Daniel Swan

I think the first step would be to see what's out there (BioTeam MiniLIMS, Galaxy, Taverna, stuff built with Ruffus or Paver) and report your findings in a blog post or on the SEQanswers wiki.

Comment written 6.6 years ago by Jeremy Leipzig
6.6 years ago, Roman Valls Guimerà (Melbourne) wrote:

Hello SteveLor,

To some extent your question reminds me of a previous post on Biostar. You might want to have a look at it:

http://biostar.stackexchange.com/questions/1269/what-is-the-best-pipeline-for-human-whole-exome-sequencing/9406#9406

In our lab we're using and extending Brad's pipeline:

https://github.com/chapmanb/bcbb/blob/master/nextgen/README.md

http://bcbio.wordpress.com/

There are a few issues that need to be addressed, but overall it does the job for us. The Galaxy side, which would greatly help with sample management, is still in the works due to IT security concerns.

Hope that helps!
