Question: Is it possible to include BBMap in Bcbio-nextgen pipeline?
elvissober (South Africa), 3.9 years ago, wrote:

Is it possible to include BBMap in Bcbio-nextgen pipeline? Are there any examples and tutorials on that software besides those posted on their website? Thanks.

sequencing soft wg seq • 1.4k views
Modified 3.9 years ago by Brad Chapman • written 3.9 years ago by elvissober

I'm currently in the process of writing guides for most of the tools in the BBMap package...  I've finished several, and BBMap is on my list for this week.

Written 3.9 years ago by Brian Bushnell
Brad Chapman (Boston, MA), 3.9 years ago, wrote:

It's possible, but BBMap would need integration work to be added as a new aligner. Documentation on writing that code is here:

https://bcbio-nextgen.readthedocs.org/en/latest/contents/code.html#aligner

Out of curiosity, why do you prefer BBMap over other aligners already integrated in bcbio, like bwa mem?

Written 3.9 years ago by Brad Chapman

I'm curious about BBMap as well. Brian seems to be quite a comprehensive resource on Biostars, so maybe he can comment. Things I can glean from the internet:

- BBMap uses a semi-global alignment algorithm instead of local Smith-Waterman

- There's this poster where Brian benchmarks on synthetic data with varying mismatches and indel sizes, showing good performance for large gaps in alignment/reference coordinates.

- The SAM output is TopHat compatible - so maybe this is designed for RNAseq applications?

Written 3.9 years ago by Matt Shirley

So, yes, BBMap is designed for RNA-seq and DNA-seq, and in terms of accuracy it outperforms all other aligners I've tested when dealing with indels, particularly long ones. It's also good at aligning very highly mutated sequences or very low-quality data. BWA-mem has an advantage in memory use and (usually) alignment speed, though. And while this aspect is not very important with human resequencing, BBMap has a huge advantage in index-building time over other aligners; this is actually quite important when analyzing large numbers of de novo assemblies of different organisms with different assemblers and parameters.

 

Written 3.9 years ago by Brian Bushnell
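
For readers who want to try BBMap on its own before any bcbio integration, here is a minimal sketch of a stand-alone run driven from Python. It only assembles the `bbmap.sh` command line using the `ref=`, `in=`, `in2=`, `out=` and `maxindel=` parameters; the file names and the `maxindel` value are hypothetical placeholders, not recommendations from this thread.

```python
import shlex
from typing import List, Optional


def bbmap_command(ref: str, reads1: str, out_sam: str,
                  reads2: Optional[str] = None,
                  maxindel: Optional[int] = None) -> List[str]:
    """Build a bbmap.sh command line as a list of arguments.

    All paths are supplied by the caller; nothing is executed here.
    """
    cmd = ["bbmap.sh", f"ref={ref}", f"in={reads1}"]
    if reads2:
        # Second file of a paired-end library.
        cmd.append(f"in2={reads2}")
    cmd.append(f"out={out_sam}")
    if maxindel is not None:
        # Upper bound on the indel length BBMap will search for.
        cmd.append(f"maxindel={maxindel}")
    return cmd


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    cmd = bbmap_command("reference.fa", "reads_1.fq", "mapped.sam",
                        reads2="reads_2.fq", maxindel=100000)
    print(shlex.join(cmd))
    # To actually run it (requires bbmap.sh on PATH):
    # import subprocess; subprocess.run(cmd, check=True)
```

Building the command as a list rather than a single string avoids shell-quoting problems if a path contains spaces.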

I have just tried it and it does the job of producing a .sam file. However, I could not find better software than bcbio-nextgen. Does anybody have a clear tutorial on using bcbio for WGS, from .sra to .vcf? Thanks.

Written 3.9 years ago by elvissober

I need a clear tutorial on how to add the tuberculosis genome to that software and how to run a pipeline that goes from a set of .sra files to .vcf and .pdf reports. Thanks.

Written 3.9 years ago by elvissober

To process SRA data, you'll want to convert it into standard FASTQ format using the SRA Toolkit (http://www.ncbi.nlm.nih.gov/books/NBK158900/). In bcbio, you can add a custom reference genome for tuberculosis (https://bcbio-nextgen.readthedocs.org/en/latest/contents/configuration.html#adding-custom-genomes) and then run a standard variant calling pipeline by creating a configuration file (https://bcbio-nextgen.readthedocs.org/en/latest/contents/configuration.html#automated-sample-configuration). I haven't personally called variants on tuberculosis, so I don't have any species-specific tips, but I hope this helps you get started.

Written 3.9 years ago by Brad Chapman
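
The two steps in the reply above (SRA to FASTQ, then bcbio's automated sample configuration) could be sketched roughly as follows. This is only an illustration that assembles the command lines: the accession, the template file `tb-template.yaml` and the sample sheet `tb_samples.csv` are hypothetical names, and the template is assumed to point at a custom tuberculosis genome that has already been added to bcbio.

```python
import shlex
from typing import List


def fastq_dump_command(accession: str, outdir: str = "fastq") -> List[str]:
    """SRA Toolkit call that writes gzipped FASTQ files, one file per read
    of a pair (<accession>_1.fastq.gz and <accession>_2.fastq.gz)."""
    return ["fastq-dump", "--split-files", "--gzip",
            "--outdir", outdir, accession]


def bcbio_template_command(template: str, samples_csv: str,
                           fastqs: List[str]) -> List[str]:
    """bcbio automated sample configuration: builds a project config
    from a template, a sample sheet and the input FASTQ files."""
    return ["bcbio_nextgen.py", "-w", "template",
            template, samples_csv, *fastqs]


if __name__ == "__main__":
    accessions = ["SRR0000001", "SRR0000002"]  # placeholder accessions
    fastqs = []
    for acc in accessions:
        print(shlex.join(fastq_dump_command(acc)))
        # Paired-end output names produced by --split-files --gzip.
        fastqs += [f"fastq/{acc}_1.fastq.gz", f"fastq/{acc}_2.fastq.gz"]
    print(shlex.join(bcbio_template_command(
        "tb-template.yaml", "tb_samples.csv", fastqs)))
```

The printed commands can be inspected first and then run by hand or through `subprocess.run(cmd, check=True)`; the generated project configuration is what the bcbio pipeline run then takes as input.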