Forum: Commercial All-In-One Versus Custom Pipeline For Clinical Applications
5.6 years ago, Dan Gaston wrote:

A couple of older posts cover similar ground, but they are over two years old and lack any real comparisons. Given how much NGS has matured in the last few years, particularly in clinical applications, I am hoping some people now have more concrete experience and information to share. I am currently part of a working group for a regional hospital that is looking to implement NGS-based testing in its molecular diagnostics laboratory. We will be starting with a bench-top sequencer (most likely either MiSeq or Ion Torrent), and it will be dual use (research/clinical). Most clinical applications will likely be in oncology, at least initially. As the local bioinformatics expert for human NGS applications, I am evaluating what will fit their needs best. I believe there is support for bioinformatics staffing as part of the setup. As a bioinformatician I am leery of unpublished algorithms and black boxes. While commercial packages like CLC Workbench and NextGENe are very easy to use, I don't like not knowing what is going on with my data.

Does anyone have any direct experience comparing the results of packages like NextGENe, for instance, with pipelines built from published open-source software? I think open-source solutions like bcbio-nextgen are particularly well suited to this kind of work, especially with long-term scalability and growth in mind.

modified 5.6 years ago by Charles Warden • written 5.6 years ago by Dan Gaston
5.6 years ago, Charles Warden (Duarte, CA) wrote:

It depends - for variant calling, I would probably go open-source all the way. This is what I do for exome-capture data:

BWA alignment --> Picard remove duplicates (and targeted sequencing QC stats) --> samtools reformat (sort .bam files, create pileup) and QC stats --> VarScan variant calls --> ANNOVAR annotations.
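To make the chain above a little more concrete, here is a minimal sketch that only builds the shell command for each stage and does not run anything. The file names, jar paths, the hg19 build, the refGene protocol, and the function name build_pipeline are all illustrative assumptions, not the exact settings used in the pipeline described above. The sketch also coordinate-sorts right after alignment so that duplicate removal sees sorted input, and it leaves out the targeted-sequencing QC step.

```python
import shlex


def build_pipeline(sample, ref, fq1, fq2,
                   picard_jar="picard.jar",    # assumed jar location
                   varscan_jar="VarScan.jar",  # assumed jar location
                   annovar_db="humandb/"):     # assumed ANNOVAR database dir
    """Return the shell commands for each stage, in order, without running them."""
    q = shlex.quote
    sorted_bam = f"{sample}.sorted.bam"
    dedup_bam = f"{sample}.dedup.bam"
    metrics = f"{sample}.dup_metrics.txt"
    flagstat = f"{sample}.flagstat.txt"
    pileup = f"{sample}.mpileup"
    vcf = f"{sample}.snp.vcf"
    return [
        # 1. BWA alignment, piped into a coordinate sort
        f"bwa mem {q(ref)} {q(fq1)} {q(fq2)} | samtools sort -o {q(sorted_bam)} -",
        # 2. Picard duplicate removal
        f"java -jar {q(picard_jar)} MarkDuplicates I={q(sorted_bam)} "
        f"O={q(dedup_bam)} M={q(metrics)} REMOVE_DUPLICATES=true",
        # 3. samtools QC stats
        f"samtools flagstat {q(dedup_bam)} > {q(flagstat)}",
        # 4. samtools pileup for the variant caller
        f"samtools mpileup -f {q(ref)} {q(dedup_bam)} > {q(pileup)}",
        # 5. VarScan SNP calls (mpileup2indel would cover small indels)
        f"java -jar {q(varscan_jar)} mpileup2snp {q(pileup)} --output-vcf 1 > {q(vcf)}",
        # 6. ANNOVAR annotation; build and protocol are assumptions
        f"table_annovar.pl {q(vcf)} {q(annovar_db)} -buildver hg19 "
        f"-out {q(sample)} -protocol refGene -operation g -vcfinput",
    ]


if __name__ == "__main__":
    for cmd in build_pipeline("sample1", "ref.fa", "reads_1.fq", "reads_2.fq"):
        print(cmd)
```

Printing the commands rather than executing them keeps every step visible and easy to log, which is the main thing an open pipeline offers over a black box.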

I like CoNIFER best for copy number calls (but I don't think it captures everything) and I haven't been very satisfied with any structural variant callers. So, I would probably only return SNPs and small indels back to patients (in addition to raw data, if they want to explore on their own).

The one thing I like better in CLC Bio is the de novo assembly algorithm. There is currently no published paper that I can point you to, but I can tell you it is fast, and I have liked its contigs the best (even compared to algorithms specifically designed for RNA-Seq, namely Trinity and Oases / Velvet). I had a viral assembly algorithm using SSAKE that I liked better, but it was an entire pipeline optimized specifically for herpesvirus assembly:

However, I assume de novo assembly won't be too important for most clinical applications.

I've also analyzed both Illumina (MiSeq and HiSeq) and Proton data, and I would definitely recommend sticking with Illumina. Proton data has more problems.

However, it may be worth noting that all of this is for research purposes. You may want something else for a clinical application. I don't have much experience with testing tools in this area, but I did notice this, published relatively recently.

modified 5.6 years ago • written 5.6 years ago by Charles Warden

I should add that in my own research I of course have a custom pipeline for analysis of exome sequencing data. It is certainly my preference, but I would be interested in hearing thoughts on some of the commercial applications from anyone who has worked with them.

reply written 5.6 years ago by Dan Gaston

And thanks for the link to the very recent article.

reply written 5.6 years ago by Dan Gaston