I am new to genomics and am not clear about the significance of de novo assembly. I know that de novo assembly doesn't use a reference template whereas mapping assembly does, but what is the impact of this on the final assembled result? How is de novo assembly better or worse, and in which situations should each be used? Moreover, mapping assembly looks similar to sequence alignment, which adds to the confusion.
If no good reference is available, you will need to go for de novo assembly. If a good reference genome is available, you might consider reference-based assembly.
On top of that, de novo assembly also lets you assemble sequences that are not present in the reference. With a reference you will always be biased towards what is in the reference, and will thus miss anything new.
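To make the reference-bias point concrete, here is a toy sketch in Python. Everything in it is made up for illustration: the sequences, the exact-match "mapper" and the greedy overlap assembler are hypothetical stand-ins, not how real aligners or assemblers work. The sample carries an 8-base insertion that the reference lacks, so reads touching it cannot be placed on the reference, while assembling the reads against each other recovers it.

```python
# Toy data (hypothetical): the sample has an insertion the reference lacks.
LEFT, INSERT, RIGHT = "ATGCGTACCTGA", "TTGGCCAA", "CGATAGTCACAG"
reference = LEFT + RIGHT
sample = LEFT + INSERT + RIGHT

# Error-free reads of length 10, taken every 2 bases along the sample.
READ_LEN, STEP = 10, 2
reads = [sample[i:i + READ_LEN]
         for i in range(0, len(sample) - READ_LEN + 1, STEP)]


def map_reads(reads, reference):
    """Naive 'mapping': a read maps only if it occurs verbatim in the reference."""
    mapped = [r for r in reads if r in reference]
    unmapped = [r for r in reads if r not in reference]
    return mapped, unmapped


def overlap(a, b, min_len):
    """Longest suffix of a that equals a prefix of b (at least min_len), else 0."""
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0


def greedy_assemble(reads, min_len=5):
    """Repeatedly merge the pair of contigs with the largest overlap."""
    contigs = list(dict.fromkeys(reads))  # drop duplicate reads, keep order
    while True:
        best_n, best_i, best_j = 0, None, None
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    n = overlap(a, b, min_len)
                    if n > best_n:
                        best_n, best_i, best_j = n, i, j
        if best_n == 0:
            return contigs  # nothing left to merge
        merged = contigs[best_i] + contigs[best_j][best_n:]
        contigs = [c for k, c in enumerate(contigs)
                   if k not in (best_i, best_j)] + [merged]


mapped, unmapped = map_reads(reads, reference)
contigs = greedy_assemble(reads)

print(len(mapped), "reads map to the reference,", len(unmapped), "do not")
print("insertion recovered de novo:", any(INSERT in c for c in contigs))
```

Real mappers tolerate mismatches and gaps, so a small insertion like this would often still be aligned, but large or highly divergent novel sequence ends up unmapped in much the same way.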
I think de novo assembly is the more accurate, but for that you need:
-A huge amount of sequence (I have a lot of reads that were actually supposed to cover the genome only once… they actually cover only one third of it, so assembly is impossible)
-A lot of memory… for my third of a genome, the computer answered "I cannot let you do this: it would require one week with one TB of RAM"
So the drawbacks are: expensive, time consuming…
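A rough sanity check for the first point is the standard Lander-Waterman estimate: if reads land uniformly at random, mean coverage is c = N × L / G, and the expected fraction of the genome touched by at least one read is about 1 - e^(-c). The sketch below just evaluates those two formulas; the read count, read length and genome size are hypothetical numbers, and real data (uneven coverage, repeats, errors) will usually do worse than this idealised model.

```python
import math


def mean_coverage(n_reads, read_len, genome_size):
    """Average number of times each base is sequenced: c = N * L / G."""
    return n_reads * read_len / genome_size


def expected_fraction_covered(coverage):
    """Lander-Waterman estimate, assuming uniformly random read placement."""
    return 1.0 - math.exp(-coverage)


# Hypothetical run: 1,000,000 reads of 100 bp on a 100 Mb genome.
c = mean_coverage(1_000_000, 100, 100_000_000)
print(f"mean coverage: {c:.1f}x")
print(f"expected fraction covered: {expected_fraction_covered(c):.2f}")  # about 0.63

# Depth at which the idealised model expects 99.9% of bases to be covered.
print(f"depth for 99.9% covered: {-math.log(1 - 0.999):.1f}x")
```

So even in the ideal case, "1x" data leaves more than a third of the genome unsequenced and the rest in short disconnected pieces, which is why de novo projects aim for much higher depth.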
If you have an existing genome, you can map any read / sequence onto it, and it might be quicker and cheaper.
Maybe another way could be to start from the initial reads: make a BLAST database from them, BLAST the part of the genome you are interested in against it, design primers from the alignments, and clone the region you are interested in.
If some regions are close to one another, try to clone between them…
Then you might have a partial assembly of what you are interested in.
You can also map whatever you have onto a genome, then combine that with what I just suggested and fill a few gaps with old-fashioned Sanger sequencing to get a reliable assembly of the part you are interested in.
Finally, if what you want to assemble is repetitive, RAD sequencing of a repetitive part with 2 different restriction enzymes might provide sequence that is easy to assemble.
If you are interested in unique parts, there are new methods that make a loop of DNA, allowing you to target them.
Previously we tended to use the reference genome whenever one existed. But now the two strategies are employed jointly to call variants, especially indels or CNVs. You could refer to the GATK HaplotypeCaller method of local reassembly of active regions.
Hello, I would like you to check what question you are asking in your study. You can provide a little detail on that here.
You can go one of two ways:
- If your sequenced genome is totally different, go for de novo.
- If you are working with an existing genome, do a hybrid assembly (since the species or strain you have sequenced might have undergone changes). Hybrid assembly = reference-based assembly + de novo assembly (of the unmapped reads). Hope this gives you some clarity.
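As a sketch of the "unmapped reads" half of that recipe: after mapping to the reference, the reads that did not align can be pulled out of the SAM file and handed to a de novo assembler. In the SAM format, bit 0x4 of the FLAG column marks an unmapped read. The Python below parses plain SAM text directly; the three records are invented toy data, and in practice you would more likely use samtools (for example `samtools view -f 4`) or a library rather than hand-parsing.

```python
UNMAPPED = 0x4  # SAM FLAG bit: this read did not align to the reference


def unmapped_reads(sam_lines):
    """Yield (name, sequence) for every unmapped record in SAM-formatted text."""
    for line in sam_lines:
        if not line.strip() or line.startswith("@"):
            continue  # skip blank lines and header lines
        fields = line.rstrip("\n").split("\t")
        name, flag, seq = fields[0], int(fields[1]), fields[9]
        if flag & UNMAPPED:
            yield name, seq


# Toy SAM content (hypothetical reads and positions).
toy_sam = [
    "@SQ\tSN:chr1\tLN:24",
    "read1\t0\tchr1\t1\t60\t10M\t*\t0\t0\tATGCGTACCT\tIIIIIIIIII",
    "read2\t4\t*\t0\t0\t*\t*\t0\t0\tGATTGGCCAA\tIIIIIIIIII",
    "read3\t16\tchr1\t13\t60\t10M\t*\t0\t0\tCGATAGTCAC\tIIIIIIIIII",
]

leftover = list(unmapped_reads(toy_sam))
print(leftover)  # these reads would go on to the de novo step
```

The contigs assembled from the leftover reads are then what you add to the reference-based result.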