Question: Next steps after assembly
3.5 years ago, lutra007 wrote:

Dear all,

I have assembled plasmids purified from E. coli DH10B, where they had been propagated. After plasmid isolation, genomic DNA contamination was reduced with Plasmid-Safe DNase. The samples were then sequenced on Illumina, and I got paired-end reads (250 bp; the total fragment size is 800 bp).

I used SPAdes to assemble the reads. Some of my scaffolds belong to DH10B, so I plan to remove them from my final set of scaffolds. If an alignment to DH10B spans the full length of a scaffold, it is OK to discard that scaffold, right? Or do I need to check whether the sequence has ever been found on plasmids? And what should I do if the alignment covers only 80% of the scaffold's length? Or just half?
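One way to put numbers on the "100%, 80%, or half" question is to compute, for each scaffold, the fraction of its length covered by the union of all alignments to the host genome. The sketch below assumes the hits have already been reduced to `(scaffold_id, qstart, qend)` tuples with 1-based inclusive query coordinates (for example, taken from a tabular alignment report); the function names and the input layout are illustrative, not part of SPAdes or any aligner.

```python
from collections import defaultdict


def covered_fraction(intervals, length):
    """Fraction of a scaffold covered by the union of 1-based inclusive intervals."""
    covered = 0
    current_start, current_end = None, None
    # Normalise each interval so start <= end (reverse-strand hits may be flipped).
    for start, end in sorted((min(s, e), max(s, e)) for s, e in intervals):
        if current_end is None or start > current_end + 1:
            # Gap before this interval: close the previous merged block.
            if current_end is not None:
                covered += current_end - current_start + 1
            current_start, current_end = start, end
        else:
            # Overlapping or adjacent: extend the current merged block.
            current_end = max(current_end, end)
    if current_end is not None:
        covered += current_end - current_start + 1
    return covered / length


def host_coverage(hit_rows, scaffold_lengths):
    """Map each scaffold id to the fraction of its length aligned to the host.

    hit_rows: iterable of (scaffold_id, qstart, qend) tuples (assumed layout).
    scaffold_lengths: dict of scaffold_id -> length in bp.
    """
    hits = defaultdict(list)
    for scaffold_id, qstart, qend in hit_rows:
        hits[scaffold_id].append((qstart, qend))
    return {
        sid: covered_fraction(hits.get(sid, []), length)
        for sid, length in scaffold_lengths.items()
    }
```

A scaffold with a fraction near 1.0 is a candidate host sequence, while intermediate values are probably better inspected by hand than removed automatically, since the aligned part may be a shared mobile element rather than host contamination.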

Unfortunately, DH10B contains mobile elements. IS elements longer than the sequenced fragments cannot be placed into scaffolds, so I cannot tell whether the host, my plasmid, or both carry a given element. However, SPAdes also reports paths, showing how contigs were built from graph edges and how scaffolds were built from contigs and their edges. Some IS elements are connected to several plasmid scaffolds, which is why they were not merged with other edges. These are plasmid IS elements, for sure. The IS element encoded by DH10B is not connected to anything. So maybe I can delete these isolated regions? May I delete all isolated regions even if they do not align to DH10B?

And if SPAdes produced a scaffold shorter than the size of a sequenced fragment, it means that the assembler hit a dead end in the graph, so these short scaffolds can be ignored, right? They are shorter than 800 bp, so they are meaningless too.
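A cautious variant of this length filter is to set the short scaffolds aside rather than delete them, so they can still be inspected later. The sketch below is a minimal FASTA reader plus a length split; the 800 bp default mirrors the fragment size mentioned above, and the function names are made up for illustration.

```python
def read_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new record starts: emit the previous one, if any.
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def split_by_length(records, min_length=800):
    """Separate records into (kept, short) lists by sequence length."""
    kept, short = [], []
    for header, seq in records:
        (kept if len(seq) >= min_length else short).append((header, seq))
    return kept, short
```

With a file on disk this could be called as `split_by_length(read_fasta(open("scaffolds.fasta")))`, where `scaffolds.fasta` stands in for whatever the assembly output is called.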

A more general question: do you run any checks on your scaffolds after assembly? Do you check them for the presence of adapters (of course, I trimmed them, but it is better to check once again, right?)? Any other tests?
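As a quick first pass for the adapter question, the scaffolds can be scanned for exact occurrences of the adapter sequences and their reverse complements. This is only a sketch: it finds exact matches, so partial or mismatched adapter remnants would be missed, and the adapter list has to come from the library prep kit that was actually used. The function names are illustrative.

```python
# Translation table for complementing DNA bases (other characters pass through).
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def find_adapter_hits(scaffolds, adapters):
    """Report exact adapter matches on both strands.

    scaffolds: dict of name -> sequence.
    adapters: list of adapter sequences (supplied by the user).
    Returns a list of (scaffold_name, 0-based position, strand, adapter).
    """
    hits = []
    for name, seq in scaffolds.items():
        upper = seq.upper()
        for adapter in adapters:
            probes = (
                ("+", adapter.upper()),
                ("-", reverse_complement(adapter.upper())),
            )
            for strand, probe in probes:
                pos = upper.find(probe)
                while pos != -1:
                    hits.append((name, pos, strand, adapter))
                    pos = upper.find(probe, pos + 1)
    return hits
```

Any hit is worth a closer look, especially one near a scaffold end, where untrimmed adapter remnants tend to show up.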

sequencing next-gen assembly • 1.2k views

While I do not have the experience to advise you on the IS issue, I can suggest some standard general checks for your scaffolds:

  1. Adapters: yes, they often linger and sometimes make their way into the assembled contigs. You can use NCBI's contamination detection pipeline to identify them. Which brings me to my next point:
  2. Contamination: it can come from multiple sources and can sometimes assemble into whole contigs or even scaffolds. Again, you can use NCBI's pipeline to detect these regions and trim or eliminate them.
  3. Chimeric contigs and possible misassemblies.
  4. Short contigs: most add little value to your assembly, but some may be worth including in the final set.
  5. Read depth across scaffolds: be wary of scaffolds where coverage is too high or too low; they may be a result of contamination.
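For the read depth check in point 5, a rough first screen is possible without mapping reads, because SPAdes normally names its sequences like `NODE_1_length_5000_cov_40.0`, where `cov` is a k-mer coverage estimate. The sketch below assumes that default naming and flags scaffolds whose coverage is far from the median; the 0.2x and 5x cutoffs and the function names are arbitrary illustrations, not recommended values.

```python
import re
from statistics import median

# Assumed default SPAdes header layout: NODE_<id>_length_<bp>_cov_<k-mer coverage>
NODE_RE = re.compile(r"NODE_(\d+)_length_(\d+)_cov_(\d+(?:\.\d+)?)")


def parse_spades_header(header):
    """Return (length, coverage) parsed from a SPAdes-style header, or None."""
    match = NODE_RE.search(header)
    if match is None:
        return None
    return int(match.group(2)), float(match.group(3))


def flag_coverage_outliers(headers, low_factor=0.2, high_factor=5.0):
    """Flag headers whose coverage is far below or above the median coverage."""
    parsed = {h: parse_spades_header(h) for h in headers}
    coverages = [p[1] for p in parsed.values() if p is not None]
    if not coverages:
        return {}
    mid = median(coverages)
    flags = {}
    for header, values in parsed.items():
        if values is None:
            continue  # header does not follow the assumed naming scheme
        coverage = values[1]
        if coverage < low_factor * mid:
            flags[header] = "low"
        elif coverage > high_factor * mid:
            flags[header] = "high"
    return flags
```

In a plasmid prep the median is only a rough baseline, since plasmids with different copy numbers and residual host DNA are expected to differ in depth; mapping the reads back to the scaffolds gives a more reliable picture than the k-mer coverage in the header.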
written 3.5 years ago by shwethacm220

Thank you! I wasn't aware of NCBI's pipeline, I will use it! And I will check the coverage and how my reads map to the scaffolds.

written 3.5 years ago by lutra007