Question: Method used for reducing contigs, Ion Torrent 400 bp [unpaired] (critique/ways to improve would be appreciated)
5.6 years ago by
b.lehri1180
United Kingdom
b.lehri1180 wrote:

Hi All

I would like people to critique the method I used to reduce the number of contigs from Ion Torrent 400 bp runs for a bacterial genome. If it is sound, other people can use it as a guide.

  1. I had two runs of unpaired 400 bp reads from Ion Torrent.
  2. I de novo assembled the reads from each run using the Torrent assembler and the CLC assembler (around 150 to 175 contigs for each run).
  3. I used the CISA contig integrator to combine my Torrent and CLC contigs (this produced around 84 contigs for run 1 and 94 for run 2).
  4. I combined the two runs using CISA again to produce 49 contigs.
  5. I mapped reads onto the contigs and split some of the contigs, which produced around 57 contigs.
  6. I mapped reads onto the contigs and collected the unmapped reads, then did a de novo assembly of the unmapped reads. I then used the Geneious assembly tool on the contigs generated from the unmapped reads together with the 57 contigs from step 5.
    1. I managed to reduce the number of contigs to 22 (94% of bases mapped when read mapping).
  7. I mapped reads and split some contigs that had poor read mapping. This resulted in around 25 contigs.
  8. I also had a 200 bp run that was previously done on the same bacterium, and with it I managed to reduce the number of contigs to around 20 (96% of bases mapped after read mapping).
  9. Two contigs had poor read coverage, so I removed the regions I didn't trust (I did not split them).
  10. I mapped reads onto the contigs again, collected the unmapped reads, and did a de novo assembly of the unmapped reads.
  11. I combined both sets of contigs (those from the unmapped reads in step 10 and the original contigs) to get a total of 40 contigs (99% of bases mapped after read mapping).
  12. I plan on running the contigs through CONTIGuator, designing primers, and then filling in the gaps using Sanger sequencing.
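
Steps 6 and 10 both depend on pulling the unmapped reads out of a read mapping and on reporting the fraction of bases mapped. The following is a minimal sketch of that bookkeeping, assuming the mapper writes plain-text SAM for unpaired reads. The function names `split_sam_records` and `to_fastq` are hypothetical, and "bases mapped" is approximated here as the full length of every mapped read, which ignores soft clipping and may differ from what a given mapping tool reports.

```python
from typing import Iterable, List, Tuple

UNMAPPED_FLAG = 0x4         # SAM flag bit: read is unmapped
SECONDARY_FLAG = 0x100      # SAM flag bit: secondary alignment
SUPPLEMENTARY_FLAG = 0x800  # SAM flag bit: supplementary alignment

Read = Tuple[str, str, str]  # (name, sequence, quality string)


def split_sam_records(lines: Iterable[str]) -> Tuple[List[Read], int, int]:
    """Return (unmapped reads, mapped bases, total bases) from SAM text lines.

    Assumption: each read has one primary record with its sequence in the SEQ column.
    """
    unmapped: List[Read] = []
    mapped_bases = 0
    total_bases = 0
    for line in lines:
        # Skip blank lines and header lines (headers start with '@').
        if not line.strip() or line.startswith("@"):
            continue
        fields = line.rstrip("\n").split("\t")
        name, flag, seq, qual = fields[0], int(fields[1]), fields[9], fields[10]
        # Count every read once: ignore secondary and supplementary records.
        if flag & (SECONDARY_FLAG | SUPPLEMENTARY_FLAG):
            continue
        total_bases += len(seq)
        if flag & UNMAPPED_FLAG:
            unmapped.append((name, seq, qual))
        else:
            mapped_bases += len(seq)
    return unmapped, mapped_bases, total_bases


def to_fastq(reads: Iterable[Read]) -> str:
    """Format reads as FASTQ text, ready for another de novo assembly."""
    return "".join(f"@{name}\n{seq}\n+\n{qual}\n" for name, seq, qual in reads)


if __name__ == "__main__":
    # Tiny made-up example: one mapped read and one unmapped read.
    demo = [
        "@HD\tVN:1.6",
        "r1\t0\tcontig1\t1\t60\t8M\t*\t0\t0\tACGTACGT\tIIIIIIII",
        "r2\t4\t*\t0\t0\t*\t*\t0\t0\tGGCC\tIIII",
    ]
    reads, mapped, total = split_sam_records(demo)
    print(f"{100 * mapped / total:.1f}% of bases mapped")
    print(to_fastq(reads), end="")
```

Tracking this percentage after every round makes it easier to see whether a merge or split step actually improved the assembly or only changed the contig count.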

Is this method okay? Any criticism or suggestions for improvement would be appreciated.

Are 40 contigs okay for Sanger sequencing, or should I try to reduce the number of contigs further?

Any feedback would be much appreciated.

sequencing assembly • 2.2k views
5.6 years ago by
Rayan Chikhi1.4k
France, Lille, CNRS
Rayan Chikhi1.4k wrote:

I have a few remarks:

  • CLC assembler: did you have time to try another assembler? I'm thinking that SPAdes might have done a much better job, as it's excellent with bacterial genomes and has support for Ion Torrent.
  • CISA: it might not be the best tool for the job. It also creates errors, I suspect by wrongly merging contigs. There are other, better-known assembly merging tools that would need to be evaluated. What about GAM and Mix? (http://www.biomedcentral.com/1471-2105/14/S7/S6, http://www.biomedcentral.com/1471-2105/14/S15/S16)
  • The number of contigs should not be the only metric. How did the total assembly size and NG50 change after each step?
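
To make the last point concrete, here is a minimal sketch of how total assembly size, N50 and NG50 could be computed from a contig FASTA after each step. The function names are hypothetical, and the genome size passed to `assembly_stats` is an assumption the user must supply (for example, the size of a closely related finished genome). NG50 is taken here as the length of the contig at which the cumulative length, counted from the longest contig down, first reaches half of that assumed genome size; N50 uses the assembly's own total instead.

```python
from typing import Dict, Iterable, List


def read_contig_lengths(fasta_lines: Iterable[str]) -> List[int]:
    """Return the length of every sequence in FASTA-formatted lines."""
    lengths: List[int] = []
    current = None  # length of the record being read, None before the first header
    for line in fasta_lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if current is not None:
                lengths.append(current)
            current = 0
        elif current is not None:
            current += len(line)
    if current is not None:
        lengths.append(current)
    return lengths


def nx50(lengths: Iterable[int], reference_size: int) -> int:
    """Contig length at which the running total reaches half of reference_size.

    Returns 0 if the contigs never cover half of reference_size.
    """
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= reference_size:
            return length
    return 0


def assembly_stats(lengths: List[int], genome_size: int) -> Dict[str, int]:
    """Summarise an assembly; genome_size is a user-supplied estimate."""
    total = sum(lengths)
    return {
        "contigs": len(lengths),
        "total_size": total,
        "N50": nx50(lengths, total),
        "NG50": nx50(lengths, genome_size),
    }


if __name__ == "__main__":
    # Made-up contig lengths and genome size, for illustration only.
    print(assembly_stats([100, 80, 60, 40, 20], genome_size=400))
```

If the contig count drops while the total size moves far from the expected genome size, or NG50 does not improve, that would suggest the merging step is collapsing or mis-joining sequence rather than genuinely improving the assembly.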