Best software to assemble bacterial genomes
7
13
9.4 years ago
fhsantanna ▴ 610

I have sequencing data for five bacteria, generated on an Illumina MiSeq. Four of them were sequenced with a paired-end 2x300 protocol and one with the Nextera mate-pair protocol.

My question is: which software do you recommend for assembling these genomes (the largest is almost 8 Mbp)?

I have access to CLC Workbench. It seems quite easy to use, but I don't know if it is the best option. Most of the papers I found that evaluate assembler performance are two to three years old.

I should also mention that I have two i7 PCs with 8 threads each available for this (one with 32 GB and the other with 8 GB of RAM).

Thanks in advance.

illumina assembler assembly bacteria • 16k views
1

Not exactly what you're looking for, but this guide was very useful for me: Beginner’s guide to comparative bacterial genome analysis using next-generation sequence data.

0

I already read this paper. Very good. They recommended Velvet, but I believe that there are better options. Thank you anyway.

0

I like using CLC Workbench. Even if you don't use its assembler, you can do the last steps in CLC, which is much more convenient. Try SPAdes and SOAPdenovo, and compare the results with CLC's built-in assembler.

9
9.4 years ago
iraun 6.2k

You should check the literature to find out which one is best for your specific data. Here is a nice, recent paper (2014) comparing some assembly tools: http://genomebiology.com/2014/15/3/R42

In my opinion, SOAPdenovo2 and SGA are good choices. Bambus is quite difficult to install and to understand. SSPACE is also nice, but if you want to use the latest version you have to pay, so...

Hope it helps.

8
9.4 years ago
rtliu ★ 2.2k

For bacterial genomes, the GAGE-B paper (2013) compares eight genome assemblers:

  • ABySS v1.3.4
  • CABOG v7.0
  • MIRA v3.4.0
  • MaSuRCA (MSR-CA) v1.8.3
  • SGA v0.9.34
  • SOAPdenovo2 v2.04 + GapCloser v1.12
  • SPAdes v2.3.0
  • Velvet v1.2.08

All GAGE-B data and assembly recipes are available here.

For a more recent comparison of genome assemblers, have a look here.

As genome size and GC content differ from one bacterium to another, you need to check these reproducible benchmarks.
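
To pick the benchmark genome closest to your own, it helps to know the approximate size and GC content of your data or of a draft assembly. Here is a minimal Python sketch that computes both from FASTA-formatted lines. The function name and the toy input are only illustrative, and ambiguous bases such as N are left out of the totals (my assumption, so that gaps do not dilute the GC%).

```python
from typing import Iterable, Tuple


def fasta_size_and_gc(lines: Iterable[str]) -> Tuple[int, float]:
    """Return (number of unambiguous bases, GC percentage) for FASTA lines."""
    total = 0
    gc = 0
    for line in lines:
        line = line.strip()
        if not line or line.startswith(">"):
            continue  # skip blank lines and sequence headers
        seq = line.upper()
        gc += seq.count("G") + seq.count("C")
        # Count only A/C/G/T so that runs of N do not dilute the GC%.
        total += sum(seq.count(base) for base in "ACGT")
    gc_percent = 100.0 * gc / total if total else 0.0
    return total, gc_percent


# Toy input standing in for a real FASTA file (hypothetical contigs).
example = [">contig_1", "ATGCGC", "NNAT", ">contig_2", "GGCC"]
size, gc_percent = fasta_size_and_gc(example)
print(f"{size} bp, {gc_percent:.1f}% GC")  # 12 bp, 66.7% GC
```

For a real file you would pass an open file handle instead of the list, e.g. `fasta_size_and_gc(open("contigs.fasta"))`, where the file name is a placeholder.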

1

I will second the recommendation of http://nucleotid.es/

It compares numerous assemblers on microbial genomes, using objective metrics as reported by QUAST (a tool for evaluating assemblies), and it includes both peak memory usage and CPU time.

9
9.4 years ago
lexnederbragt ★ 1.3k

All the articles mentioned conclude that there is no single best assembler for bacterial genomes. It depends on the genome and the data. So you'll have to try a few, then validate the assemblies using tools such as FRCBam, REAPR or one of the likelihood methods. If you don't care about all this, use SPAdes. If you want a tool that automates most of this, look at iMetAMOS: www.cbcb.umd.edu/software/imetamos

0

Yes, SPAdes performs very well and it's robust. I would recommend using the --careful option which, according to the nucleotid.es benchmarks, reduces errors while keeping the same N50.
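
For reference, a paired-end SPAdes run with --careful could be put together roughly as below. This is only a sketch: the FASTQ and output names are placeholders, the thread and memory values just mirror the 8-thread, 32 GB machine from the question, and the flags should be checked against `spades.py --help` for the version you have installed. The snippet only builds and prints the command; it does not run the assembler.

```python
import shlex
from typing import List


def spades_command(r1: str, r2: str, outdir: str,
                   threads: int = 8, memory_gb: int = 32) -> List[str]:
    """Build a SPAdes command line for one paired-end library."""
    return [
        "spades.py",
        "--careful",            # mismatch/indel correction, as recommended above
        "-1", r1,               # forward reads
        "-2", r2,               # reverse reads
        "-t", str(threads),     # number of threads
        "-m", str(memory_gb),   # memory limit in GB
        "-o", outdir,           # output directory
    ]


# Placeholder file names; replace with your own reads.
cmd = spades_command("sample_R1.fastq.gz", "sample_R2.fastq.gz", "spades_out")
print(" ".join(shlex.quote(arg) for arg in cmd))
```

To actually launch it, the list can be handed to `subprocess.run(cmd, check=True)`. On the 8 GB machine you would lower the memory value accordingly.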

4
9.4 years ago
moorem ▴ 240

As mentioned, SPAdes is great; alternatively, check out the A5 assembly pipeline. Following the full GAGE-B paper, it has produced better QUAST results than SPAdes for MiSeq data. A lot depends on your organism: how repetitive the genome is, its GC content, etc.

http://www.plosone.org/article/info%3Adoi%2F10.1371%2Fjournal.pone.0042304

EDIT: Also, check out ABACAS for scaffolding if you have a closely related reference genome.

1
9.4 years ago
HG ★ 1.2k

I would like to add that SPAdes may be a better choice for bacterial genome assembly.

0

SPAdes works well if you have uneven read lengths.

1
9.3 years ago
dago ★ 2.8k

Check out this work that compares different assembly tools. They introduce a new tool, QUAST, to check the quality of the assembly.
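
One of the contiguity metrics QUAST reports is N50: the contig length at which contigs of that length or longer cover at least half of the assembly. To make the definition concrete, here is a minimal Python sketch. It is not QUAST's own code, and the contig lengths are made up.

```python
from typing import Sequence


def n50(contig_lengths: Sequence[int]) -> int:
    """Length L such that contigs of length >= L cover at least half the assembly."""
    total = sum(contig_lengths)
    running = 0
    # Walk from the longest contig down, accumulating covered bases.
    for length in sorted(contig_lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    return 0  # empty assembly


# Made-up contig lengths: total is 290, and 100 + 80 = 180 passes the halfway mark.
lengths = [100, 80, 50, 30, 20, 10]
print(n50(lengths))  # 80
```

N50 only measures contiguity, so a misassembled genome can still score well; that is why the error-oriented metrics matter too.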

0
9.4 years ago
Whoknows ▴ 960

You can also try the GATK package; it has numerous features for genome analysis.

