Question: still not clear about the differences between de novo genome sequencing and genome resequencing
Yingzi Zhang50 wrote:

Hi all, I am not clear about the differences between de novo assembly and genome resequencing.

I know that if there is no reference genome for the species I am interested in, I need to do de novo assembly to assemble and annotate its genome. And if the genome is already assembled and annotated, I just need to do genome resequencing to analyse structural variation (or even sequence differences in genes?).

That is the difference in terms of their aims. But what about feasibility? I can't see any difference at the read-generation stage (though BAC clones, etc. are used to evaluate the data during de novo assembly). I saw one published de novo genome that was assembled from about 300 Gb of 2 x 100 bp reads (it was published in 2017, so it is not old at all; the sequencing depth is about 100x). If a "resequenced" genome was derived from 1 Tb of 2 x 100 bp reads (good base quality), can I reuse its data to do a de novo assembly instead? For example, Burmese and Ragdoll are both cats, but they are different breeds. The genome that has already been assembled de novo and serves as the reference is from a Burmese cat (from 300 Gb of reads); the "resequenced genome" is from a Ragdoll (from 1 Tb of reads). Can I assemble the Ragdoll genome de novo and annotate it, so that other Ragdolls have a better reference?

Please tell me if this is absurd or if I have made any factual errors. Thank you.

Yingzi

sequencing assembly • 219 views
modified 5 months ago by genomax59k • written 5 months ago by Yingzi Zhang50
genomax59k wrote:

The basic premise in your second paragraph is correct.

That said, biology is rarely a linear accounting of nucleotides (just as knowing a DNA sequence is not enough to understand how it encodes the information that ultimately forms proteins and makes a cell function). A lot depends on the organization of the genome you are interested in. If you know nothing about it, then finding out would be your first task: the number of chromosomes, the ploidy of the genome (diploid genomes, with 2 copies, are hard enough, but others may be polyploid and become very hard or impossible to assemble) and how many repeats there are (repeats longer than the reads cannot be resolved with short reads alone; you need long reads such as PacBio/Nanopore).
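To make the point about repeats concrete, here is a toy sketch in Python. The sequence, the repeat and the function name `ambiguous_fraction` are all made up for illustration, and this is not how an assembler works internally. It only counts how many read-length windows occur more than once in a small artificial genome, which is a rough proxy for reads that cannot be placed uniquely.

```python
from collections import Counter


def ambiguous_fraction(genome, read_len):
    """Fraction of read-length windows whose sequence occurs more than once."""
    windows = [genome[i:i + read_len] for i in range(len(genome) - read_len + 1)]
    counts = Counter(windows)
    repeated = sum(1 for w in windows if counts[w] > 1)
    return repeated / len(windows)


# Hypothetical toy genome: one 16 bp repeat present in two copies,
# separated and flanked by unique sequence.
repeat = "ACGTTGCAAGGCTTAC"
genome = "TTGACCA" + repeat + "GGATCCTA" + repeat + "CATGAAC"

short_frac = ambiguous_fraction(genome, 8)   # reads shorter than the repeat
long_frac = ambiguous_fraction(genome, 20)   # reads longer than the repeat

print(f"8 bp windows that are not unique:  {short_frac:.2f}")
print(f"20 bp windows that are not unique: {long_frac:.2f}")
```

Windows shorter than the repeat can fall entirely inside it and look identical in both copies, while windows longer than the repeat always reach into unique flanking sequence. Real genomes have repeats of many kilobases, which is why 100 bp reads alone leave gaps and collapsed regions.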

You can use a related genome as a guide for the assembly of a new one, but if the two species are distinct then you can't really use the related genome as a reference for the new one. It depends on how far apart they are in evolutionary terms.

You must have heard the saying: $1K genome/$100,000 annotation. You could easily collect a terabase of sequence with today's technology. Assembling the raw nucleotide information into a usable genome could easily take 100x that in time/money.
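For a rough sense of what the numbers in the question imply, here is a back-of-the-envelope sketch in Python. The genome size is only what 300 Gb at about 100x works out to, not a measured value, and the depth is nominal (it ignores duplicates, trimming and uneven coverage). The function and variable names are my own.

```python
GB = 1e9  # bases per gigabase


def implied_genome_size(total_bases, depth):
    """Genome size implied by a sequencing yield and a reported mean depth."""
    return total_bases / depth


def mean_depth(total_bases, genome_size):
    """Nominal mean depth: total sequenced bases divided by genome size."""
    return total_bases / genome_size


published_yield = 300 * GB   # published de novo assembly, from the question
published_depth = 100        # "about 100x", from the question
reseq_yield = 1000 * GB      # the 1 Tb "resequencing" data set
read_len = 100               # 2 x 100 bp paired-end reads

genome_size = implied_genome_size(published_yield, published_depth)
reseq_depth = mean_depth(reseq_yield, genome_size)
read_pairs = reseq_yield / (2 * read_len)

print(f"implied genome size: {genome_size / GB:.1f} Gb")
print(f"nominal depth of the 1 Tb data set: {reseq_depth:.0f}x")
print(f"read pairs in the 1 Tb data set: {read_pairs:.1e}")
```

So the 1 Tb data set is a bit more than three times as deep as the published one (roughly 333x against 100x on a genome of about 3 Gb). Depth is not the limiting factor here; read length and library types are.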

modified 5 months ago • written 5 months ago by genomax59k

There is a highly related genome available, so I should try to use it. I am only allowed to use data from the Internet, so I have no long reads for now (and I don't think there will be any soon). Would the absence of long reads be fatal? Am I too confident in my 1 Tb of short reads?!

written 5 months ago by Yingzi Zhang50

Generally, you need to make long-read libraries yourself if you have a particular interest in finishing a genome. It is good to be confident but cautious. You can only use the data you have at hand, so make the best of what you have.

written 5 months ago by genomax59k

Many thanks. I no longer doubt myself and feel full of strength!

written 5 months ago by Yingzi Zhang50