Question: Average Genome Size in mixed dataset
5 days ago by
Seb_Lopez10
Paris
Seb_Lopez10 wrote:

So this might sound like a trivial question: is it possible to calculate the average genome size in a mixed dataset composed of complete (closed) genomes and assemblies? I have read that for assemblies, the only thing one can calculate is the assembly size, which is just an approximation of the real genome size. I've seen papers that report average genome sizes for complete and draft genomes, but I can't quite figure out how they do it (or whether it is correct). Is there a particular definition of average genome size?

Hope this is the right place to ask this type of question.

genome • 70 views
5 days ago by
Vijay Lakhujani2.7k
India
Vijay Lakhujani2.7k wrote:

Hi

So this might sound like a trivial question: is it possible to calculate the average genome size in a mixed dataset composed of complete (closed) genomes and assemblies?

As Biostars says: no question is too trivial or too "newbie".

I assume that you are talking about metagenomic samples. The answer is no. Here is why.

Average genome size usually means a reasonable estimate of the genome size of a species, or a consensus of the genome sizes of different strains of the same organism. For example, different bacterial strains of the same species may vary in genome size. The same is true for bacteriophages, which have a broad range of genome sizes. So it is generally said that the genome size of organism xyz ranges from, for example, 100 to 150 Mb. That range is what is meant by the average size.
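As a rough illustration of reporting a consensus size with its range, here is a minimal Python sketch. The function name `summarize_sizes`, the group labels, and all sizes are made up for the example and are not real data. It keeps complete genomes and draft assemblies in separate groups, on the assumption that a draft assembly size only approximates the true genome size.

```python
from statistics import mean, median

def summarize_sizes(records):
    """Summarize genome sizes per (group, status).

    records: iterable of (group, status, size_bp) tuples, where status is
    e.g. "complete" or "draft". Returns a dict keyed by (group, status).
    """
    grouped = {}
    for group, status, size_bp in records:
        grouped.setdefault((group, status), []).append(size_bp)
    return {
        key: {
            "n": len(sizes),
            "mean_bp": mean(sizes),
            "median_bp": median(sizes),
            "min_bp": min(sizes),
            "max_bp": max(sizes),
        }
        for key, sizes in grouped.items()
    }

# Hypothetical sizes in base pairs, for illustration only.
records = [
    ("species_X", "complete", 4_600_000),
    ("species_X", "complete", 4_800_000),
    ("species_X", "draft", 4_500_000),
    ("species_Y", "complete", 5_200_000),
]
summary = summarize_sizes(records)
for key, stats in sorted(summary.items()):
    print(key, stats)
```

Reporting the minimum and maximum next to the mean makes the spread between strains visible, which a single average would hide.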

I've seen papers that report average genome sizes for complete and draft genomes, but I can't quite figure out how they do it (or whether it is correct). Is there a particular definition of average genome size?

It is usually done in two ways: through wet-lab techniques such as flow cytometry, and through computational methods such as k-mer analysis. See this paper
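To give an idea of the k-mer approach, here is a minimal Python sketch of the commonly used approximation: genome size ≈ total number of k-mers in the reads divided by the k-mer depth at the main peak of the histogram. The function name `estimate_genome_size`, the `min_depth` cutoff, and the toy histogram are assumptions for illustration. Real tools also model sequencing errors, repeats, and heterozygosity, which this sketch ignores.

```python
def estimate_genome_size(histogram, min_depth=5):
    """Estimate genome size from a k-mer depth histogram.

    histogram: dict mapping k-mer depth -> number of distinct k-mers
    observed at that depth (the two columns of a typical k-mer histogram).
    min_depth: depths below this are dropped as likely sequencing errors
    (an arbitrary cutoff chosen for this sketch).
    """
    solid = {d: c for d, c in histogram.items() if d >= min_depth}
    if not solid:
        raise ValueError("no k-mers at or above min_depth")
    # Depth with the most distinct k-mers, taken as the coverage peak.
    peak_depth = max(solid, key=solid.get)
    # Total k-mer occurrences contributed by the retained depths.
    total_kmers = sum(d * c for d, c in solid.items())
    return total_kmers / peak_depth

# Toy histogram: an error spike at low depth and a peak around depth 20.
toy_histogram = {1: 1_000_000, 2: 50_000, 18: 100, 19: 200, 20: 400, 21: 200, 22: 100}
estimate = estimate_genome_size(toy_histogram)
print(f"Estimated genome size: {estimate:.0f} bp")
```

This kind of estimate is computed from raw reads, not from the assembled contigs, so it applies only where the sequencing data are available.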


Hi Vijay,

Thanks for your answer. I am talking about genomes retrieved from different databases such as NCBI, EBI, or private databases at my research institution. This is a comparative-genomics type of question, i.e., I want to compare the genome sizes of different ecotypes (in bacteria, which I forgot to mention). Your definition fits well.

Thanks for the explanation and the paper! I will explore the subject a bit.

written 5 days ago by Seb_Lopez10