2.9 years ago by
France / Toulouse / GeT-Plage
I have also assembled a genome from PacBio data with the Canu assembler, and I was really satisfied with the result.
If you want to try something else, you can try the Falcon assembler from Pacific Biosciences ( https://github.com/PacificBiosciences/FALCON ). Falcon aims to output a diploid assembly, where heterozygous regions of the genome are written to a separate file. Just be warned that the PacBio tools are currently undergoing deep changes (they want to move away from the bas/bax/cmp.h5 file formats in favour of classic fasta/sam/bam files).
The PacBio tools, Falcon included, are quite complicated to install. The two classic ways are to download all the dependencies from GitHub yourself (the hard way), or to use their tool called pitchfork (but I wouldn't recommend that; PacBio engineers themselves call it "the painful way"...).
If you want to use the PacBio tools from the command line, I recommend following the steps I gave to someone else (who was struggling with the installation) on GitHub: https://github.com/PacificBiosciences/pbalign/issues/67#issuecomment-272964848
As your genome is small enough (2 Mb, is that right?), you can also try assembling it through SMRT Portal, for example with the HGAP 3 protocol.
pbalign and quiver are very important because, with PacBio data, the error rate after assembly is still around 1%. You can lower it by using your raw reads: you align them back to the assembly and recompute the consensus, a step called polishing. You can use pbalign + quiver for that.
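To give an idea of what polishing looks like in practice, here is a minimal Python sketch. It first estimates how many wrong bases a ~1% error rate leaves in a 2 Mb assembly, then builds (without running them) the two command lines for pbalign and quiver. The file names (`input.fofn`, `assembly.fasta`, `aligned.cmp.h5`, `polished.fasta`) and the helper functions are hypothetical, and the flags (`--forQuiver`, `--nproc`, `-j`, `-r`, `-o`) are written as I remember them from the pbalign and GenomicConsensus READMEs, so check `pbalign --help` and `quiver --help` on your installed version before relying on them.

```python
import shlex


def expected_errors(genome_size_bp, error_rate):
    """Rough count of erroneous bases left in an unpolished assembly."""
    return round(genome_size_bp * error_rate)


def polishing_commands(reads_fofn, assembly_fasta, aligned_cmp_h5,
                       polished_fasta, threads=8):
    """Build, but do not run, the pbalign and quiver command lines.

    Flags are assumptions based on the tools' READMEs; verify them
    against the --help output of your installed versions.
    """
    # Step 1: align the raw reads back to the draft assembly.
    align = ["pbalign", "--forQuiver", "--nproc", str(threads),
             reads_fofn, assembly_fasta, aligned_cmp_h5]
    # Step 2: recompute the consensus from those alignments.
    polish = ["quiver", "-j", str(threads), aligned_cmp_h5,
              "-r", assembly_fasta, "-o", polished_fasta]
    return [align, polish]


if __name__ == "__main__":
    # A 2 Mb genome at ~1% error still carries many wrong bases.
    print(expected_errors(2_000_000, 0.01))
    # Hypothetical file names, for illustration only.
    for cmd in polishing_commands("input.fofn", "assembly.fasta",
                                  "aligned.cmp.h5", "polished.fasta"):
        print(" ".join(shlex.quote(arg) for arg in cmd))
```

The script only prints the commands, so you can review them and then run them by hand (or pass each list to `subprocess.run`) once the tools are installed.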
If you have any questions about polishing or tool installation, I can help; I've been through the same steps!
modified 2.9 years ago
Rox • 1.1k