I'd like people to critique the method I used to reduce the number of contigs from Ion Torrent 400 bp sequencing of a bacterial genome. If it holds up, other people can use it as a guide.
- I had two runs of unpaired 400 bp reads from the Ion Torrent
- I de novo assembled my reads using the Torrent assembler and the CLC assembler (I had around 150/175 contigs for each run)
- I used the CISA contig integrator to combine my Torrent and CLC assembled contigs (this produced around 84 contigs for run 1 and 94 for run 2)
- I combined these runs using CISA again to produce 49 contigs
- I mapped reads onto the contigs and split some of my contigs, which produced around 57 contigs
- I mapped reads onto the contigs and collected the unmapped reads, then did a de novo assembly of the unmapped reads. I then ran the Geneious assembly tool on the contigs generated from the unmapped reads together with the 57 contigs generated in step 5.
- This reduced the number of contigs to 22 (94% of bases mapped when read mapping)
- I mapped reads again and split some contigs that had poor read mapping. This resulted in around 25 contigs
- I also had a 200 bp run that was done previously on the same bacterium, and with it I managed to reduce the number of contigs to around 20 (96% of bases mapped after read mapping).
- Two contigs had poor read coverage, so I removed the regions I didn’t trust (I did not split these contigs)
- I mapped reads onto the contigs again, collected the unmapped reads, and did a de novo assembly of the unmapped reads.
- I combined both sets of contigs (those from the unmapped reads in step 10 and the original contigs) to get a total of 40 contigs (99% of bases mapped after read mapping)
- I plan on running the contigs through CONTIGuator, designing primers, and then filling in the gaps using Sanger sequencing
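
For anyone who wants to reproduce the "collect the unmapped reads" steps outside a GUI, here is a minimal Python sketch. It is not the tool I actually used. It assumes the read mapping has been exported as SAM text and relies only on the standard SAM columns and flag bits. The function name `collect_unmapped` is made up for illustration, and the "fraction mapped" it returns (read bases belonging to mapped reads) is only a rough stand-in for the "% of bases mapped" figure a mapper reports.

```python
UNMAPPED = 0x4         # SAM flag bit: read is unmapped
SECONDARY = 0x100      # SAM flag bit: secondary alignment
SUPPLEMENTARY = 0x800  # SAM flag bit: supplementary alignment


def collect_unmapped(sam_lines):
    """Return (unmapped reads as FASTQ records, fraction of read bases in mapped reads).

    Hypothetical helper: `sam_lines` is any iterable of SAM text lines.
    """
    unmapped = []
    mapped_bases = 0
    total_bases = 0
    for line in sam_lines:
        # Skip blank lines and header lines (headers start with "@").
        if not line.strip() or line.startswith("@"):
            continue
        fields = line.rstrip("\n").split("\t")
        name, flag = fields[0], int(fields[1])
        seq, qual = fields[9], fields[10]
        # Count each read once: ignore secondary/supplementary records.
        if flag & (SECONDARY | SUPPLEMENTARY):
            continue
        total_bases += len(seq)
        if flag & UNMAPPED:
            unmapped.append(f"@{name}\n{seq}\n+\n{qual}\n")
        else:
            mapped_bases += len(seq)
    fraction = mapped_bases / total_bases if total_bases else 0.0
    return unmapped, fraction
```

The FASTQ records it returns could be written to a file and passed to whichever assembler is used for the de novo step on the unmapped reads.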
Is this method okay? Any criticism or suggestions for improvement would be appreciated.
Are 40 contigs okay for Sanger sequencing, or should I try to reduce the number of contigs further?
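
Since contig count is usually read alongside total assembly length and N50, here is a rough Python sketch for summarising a contig FASTA at each stage so the numbers can be compared. The function names `contig_lengths` and `summarise` are hypothetical, and the input is assumed to be plain FASTA text lines.

```python
def contig_lengths(fasta_lines):
    """Return the length of each sequence in an iterable of FASTA lines."""
    lengths = []
    current = None  # length of the record being read, None before the first header
    for line in fasta_lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if current is not None:
                lengths.append(current)
            current = 0
        elif current is not None:
            current += len(line)
    if current is not None:
        lengths.append(current)
    return lengths


def summarise(lengths):
    """Return contig count, total bases, longest contig and N50."""
    ordered = sorted(lengths, reverse=True)
    total = sum(ordered)
    running = 0
    n50 = 0
    # N50: length of the contig at which the running sum of the
    # longest contigs first reaches half of the total assembly length.
    for length in ordered:
        running += length
        if 2 * running >= total:
            n50 = length
            break
    return {
        "contigs": len(ordered),
        "total_bp": total,
        "longest": ordered[0] if ordered else 0,
        "n50": n50,
    }
```

Running `summarise(contig_lengths(open("contigs.fasta")))` on the output of each step (the file name is just a placeholder) would show whether a drop in contig number comes with a change in total length.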
Any feedback would be much appreciated.