Question: Average Genome Size in mixed dataset
4 months ago by
Seb_Lopez10
Paris
Seb_Lopez10 wrote:

This might sound like a trivial question: is it possible to calculate the average genome size in a mixed dataset composed of complete (closed) genomes and assemblies? I have read that for assemblies, the only thing one can calculate is the assembly size, which is just an approximation of the real genome size. I've seen papers that report average genome sizes of complete and draft genomes, but I can't quite figure out how they do it (or whether it is correct). Is there a particular definition of average genome size?

Hope this is the right place to ask this type of question.

genome • 146 views
modified 4 months ago by Vijay Lakhujani3.1k • written 4 months ago by Seb_Lopez10
4 months ago by
Vijay Lakhujani3.1k
India
Vijay Lakhujani3.1k wrote:

Hi

So this might sound like a trivial question: is it possible to calculate the average genome size in a mixed dataset composed of complete (closed) genomes and assemblies?

As Biostars says: no question is too trivial or too "newbie".

I assume that you are talking about metagenomic samples. The answer is no. Here is why.

Average genome size usually means a reasonable estimate of the genome size of a species, or a consensus of the genome sizes of different strains of the same organism. For example, different bacterial strains of the same species may have different genome sizes. The same is true for bacteriophages, which span a broad range of genome sizes. So an organism's genome size is generally reported as a range, for example 100-150 Mb. That is the average size.

I've seen it in some papers, where they report average genome sizes of complete and draft genomes, but can't quite figure out how they do it (or if it is correct). Is there a particular definition of average genome size?

It's usually done in two ways: through wet-lab techniques such as flow cytometry, and through computational methods such as k-mer analysis. See this paper
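To make the k-mer idea a bit more concrete, here is a minimal sketch of one common simplification: divide the total number of k-mer occurrences by the depth at the main peak of the k-mer histogram. This is not necessarily the method used in the linked paper. The function name `estimate_genome_size`, the `min_depth` error cutoff, and the toy histogram are all assumptions made for illustration; a real histogram would come from a k-mer counter run on the raw reads.

```python
def estimate_genome_size(histogram, min_depth=5):
    """Rough genome size estimate from a k-mer depth histogram.

    histogram: dict mapping k-mer depth -> number of distinct k-mers at that depth.
    min_depth: depths below this are treated as sequencing errors
               (an assumed cutoff; it should be chosen per dataset).
    """
    solid = {d: n for d, n in histogram.items() if d >= min_depth}
    if not solid:
        raise ValueError("no k-mers at or above min_depth")
    # The depth holding the most distinct k-mers approximates single-copy coverage.
    peak_depth = max(solid, key=solid.get)
    # Total k-mer occurrences divided by coverage approximates genome length.
    total_kmers = sum(d * n for d, n in solid.items())
    return total_kmers / peak_depth


# Toy histogram (hypothetical): error k-mers at depth 1-2, main peak at depth 20,
# and a small two-copy repeat peak at depth 40.
toy = {1: 50000, 2: 8000, 18: 900, 19: 1500, 20: 2000, 21: 1500, 22: 900, 40: 100}
size = estimate_genome_size(toy)
print(f"estimated genome size: {size:.0f} bp")
```

Real data usually need more care than this (heterozygosity, repeats, and a sensible error cutoff all shift the estimate), so treat the result as a rough figure.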

modified 4 months ago • written 4 months ago by Vijay Lakhujani3.1k

Hi Vijay,

Thanks for your answer. I am talking about genomes retrieved from different databases such as NCBI, EBI, or private databases at my research institution. This is a comparative-genomics type of question, i.e., I want to compare the genome sizes of different ecotypes (of bacteria, which I forgot to mention). Your definition fits well.

Thanks for the explanation and the paper! I will explore the subject a bit.
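For the comparison across ecotypes, one possible way to set it up is sketched below, assuming each genome is available as a FASTA file and that its ecotype and assembly status (complete or draft) are already known. The names `assembly_size` and `summarize`, the status labels, and the toy sequences are hypothetical. Because assembly size is only an approximation of the real genome size, the sketch averages complete and draft genomes separately instead of pooling them; that is a design choice, not an established definition.

```python
import io
from collections import defaultdict
from statistics import mean


def assembly_size(fasta_handle):
    """Sum the lengths of all sequences (contigs or replicons) in a FASTA stream."""
    total = 0
    for line in fasta_handle:
        line = line.strip()
        if line and not line.startswith(">"):
            total += len(line)
    return total


def summarize(records):
    """records: iterable of (ecotype, status, size_bp) tuples.

    Returns {ecotype: {status: (n_genomes, mean_size_bp)}} so that complete
    and draft genomes are averaged separately rather than pooled.
    """
    groups = defaultdict(lambda: defaultdict(list))
    for ecotype, status, size in records:
        groups[ecotype][status].append(size)
    return {
        eco: {st: (len(sizes), mean(sizes)) for st, sizes in by_status.items()}
        for eco, by_status in groups.items()
    }


# Toy example: in-memory FASTA text stands in for real files.
draft = io.StringIO(">contig1\nACGTACGT\nACGT\n>contig2\nGGCC\n")
closed = io.StringIO(">chromosome\nACGTACGTACGTACGTACGT\n")
records = [
    ("ecotype_A", "draft", assembly_size(draft)),
    ("ecotype_A", "complete", assembly_size(closed)),
]
summary = summarize(records)
print(summary)
```

With real data, the file handles would come from `open(path)` and the ecotype and status labels from a metadata table; reporting the number of genomes next to each mean makes it clear how much weight each average deserves.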

written 4 months ago by Seb_Lopez10