7.0 years ago by
San Francisco, CA, Cancer Therapeutics Innovation Group
The Broad talks about some genomes >500MB that it has assembled on its Allpaths-LG blog. I personally haven't had luck with it yet because not all of the modules respect the MAX_MEM_GB (or something similar) argument. I have a 1TB system to work on, but it is shared, so the run crashes at one of the modules that looks at how much memory is available and tries to use all of it. If you are on your own dedicated system with enough RAM then you should be OK. They say they are fixing that particular issue now so that people on shared systems can use the program.
That said, the guys at the Broad have been very quick to answer my questions and offer help when I got stuck going through their documentation. Their support has been a very pleasant experience and they are eager to help get you going, which I can't exactly say for BGI. Also, in the genome assembly competition, which had a ~100MB simulated genome, their assembler was one of the best, and they didn't even have a large team of people working on assembly QC and post-processing like BGI did.
I used Allpaths-LG to generate the preliminary assemblies in the Crocodilian genome announcement paper last year http://genomebiology.com/content/13/1/415. Using a combination of an overlap library and a reasonable-coverage 2kb insert library made with a custom protocol that Nader Pourmand at UCSC developed, we were able to achieve a scaffold N50 of 106Kb and a contig N50 of 28Kb. I have not been involved with the project recently, so I am not sure what the current state of the assemblies is, but Allpaths-LG did do a good job on that genome with the data we had available at the time.
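For anyone unsure how the N50 figures above are defined, here is a minimal sketch of the usual calculation: sort the contig (or scaffold) lengths from longest to shortest and report the length at which the running total first reaches half of the assembly size. The function name `n50` and the toy lengths are made up for illustration; this is not taken from Allpaths-LG or from our pipeline.

```python
def n50(lengths):
    """Return the N50 of a collection of contig or scaffold lengths.

    N50 is the length L such that sequences of length >= L together
    cover at least half of the total assembly size.
    """
    ordered = sorted(lengths, reverse=True)  # longest first
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    return 0  # empty input: no sequences, so no N50


# Hypothetical toy assembly, lengths in bp (not real data).
toy_contigs = [10, 9, 8, 7, 6, 5, 4, 3, 2]
print(n50(toy_contigs))
```

In practice you would feed it the sequence lengths parsed from the assembly FASTA, once for contigs and once for scaffolds.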