Question: meaning of CheckM output
zhangdengwei70 wrote:

Hi all,

I used CheckM to check whether my bacterial isolates, each grown from a single colony, have been contaminated. Here is the result:

  Bin Id                      Marker lineage           # genomes   # markers   # marker sets   0     1      2    3    4    5+   Completeness   Contamination   Strain heterogeneity
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
  P14_4_chromosome         k__Bacteria (UID203)           5449        104            58        0     39     32   31   2    0       100.00          131.87             91.24
  H13_5_chromosome         k__Bacteria (UID203)           5449        104            58        0     45     55   2    2    0       100.00          91.18              93.15
  H18_4_chromosome   f__Enterobacteriaceae (UID5124)      134         1173          336        1    1169    3    0    0    0       99.97            0.15               0.00
  H15_3_chromosome   f__Enterobacteriaceae (UID5124)      134         1173          336        1    1170    2    0    0    0       99.97            0.33               0.00
  H13_3_chromosome   f__Enterobacteriaceae (UID5124)      134         1173          336        1    1168    4    0    0    0       99.97            0.09               0.00
  H13_7_chromosome   f__Enterobacteriaceae (UID5162)       88         1207          328        2    1192    12   1    0    0       99.93            1.28              13.33
  P15_2_chromosome   f__Enterobacteriaceae (UID5124)      134         1172          336        1    1169    2    0    0    0       99.90            0.33               0.00
  H12_1_chromosome   f__Enterobacteriaceae (UID5124)      134         1173          336        2    1170    1    0    0    0       99.67            0.04               0.00
  H14_5_chromosome   f__Enterobacteriaceae (UID5124)      134         1173          336        4    1168    1    0    0    0       99.37            0.04               0.00
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

I am a bit confused about the meaning of completeness and contamination. Taking P14_4 as an example, the completeness is 100 while the contamination is 131.87. What do they represent? Also, are the completeness and contamination based on the Marker lineage? Any advice would be greatly appreciated!

ADD COMMENTlink modified 4 weeks ago by Asaf8.3k • written 4 weeks ago by zhangdengwei70
Asaf8.3k wrote:

CheckM uses single-copy marker genes to evaluate the completeness and contamination of a genome (or a pseudo-genome). If all the markers are found in the genome then completeness is 100% (these genes are expected to be present in essentially every genome of the lineage). If they appear more than once then the genome is probably contaminated (they normally occur in a single copy). So for P14_4 you can see that there are 104 markers: 39 appear only once, 32 appear twice, 31 appear three times and 2 appear four times. Since all the markers are found, the genome is probably complete, but since there are multiple copies you are probably looking at about 2.3 genomes instead of one (that's the ~130% contamination). The strain heterogeneity of 91.24 suggests that most of the contamination is probably from another strain of the main bacterium.
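To make the arithmetic concrete, here is a minimal sketch that turns the copy-number columns (0, 1, 2, 3, 4, 5+) into rough completeness and contamination figures. It is a simplification and not CheckM's actual calculation: CheckM groups markers into collocated marker sets and weights them, which is why it reports 131.87 for P14_4 while this per-marker version gives about 96. The function name `simple_estimates` is made up for illustration, and the 5+ column is treated as exactly 5 copies (an assumption; it is 0 in every row here).

```python
def simple_estimates(copy_counts):
    """Unweighted per-marker estimates (NOT CheckM's marker-set weighting).

    copy_counts[i] is the number of markers found i times;
    the last entry (5+) is treated as exactly 5 copies (assumption).
    """
    total = sum(copy_counts)
    present = total - copy_counts[0]           # markers found at least once
    completeness = 100.0 * present / total
    # every copy beyond the first counts as one "extra" marker
    extra = sum(n * (copies - 1)
                for copies, n in enumerate(copy_counts) if copies > 1)
    contamination = 100.0 * extra / total
    return completeness, contamination


# P14_4 row from the table: columns 0, 1, 2, 3, 4, 5+
p14_4 = [0, 39, 32, 31, 2, 0]
completeness, contamination = simple_estimates(p14_4)
print(f"{completeness:.2f} {contamination:.2f}")  # prints "100.00 96.15"
```

The gap between 96.15 and the reported 131.87 comes from the marker-set weighting, so treat the sketch only as a way to see where the numbers come from.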

So overall most of your genomes look very good; P14_4 is about 2.3 genomes and H13_5 is about two genomes from two strains.
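If you have many bins, a small script can pull the last three columns out of each table row and flag the suspicious ones. This is only a sketch: `parse_row` is a hypothetical helper, the 5% cutoff is an assumed threshold rather than anything CheckM enforces, and "1 + contamination/100" is just the rough genome-equivalents reading used in this answer.

```python
def parse_row(line):
    """Parse one data row of the CheckM table shown in the question."""
    tokens = line.split()
    # the last three whitespace-separated fields are the percentages
    completeness, contamination, heterogeneity = (float(t) for t in tokens[-3:])
    return {
        "bin": tokens[0],
        "completeness": completeness,
        "contamination": contamination,
        "heterogeneity": heterogeneity,
        # rough reading: 100% contamination ~ one extra genome
        "approx_genomes": 1.0 + contamination / 100.0,
    }


rows = [
    "P14_4_chromosome  k__Bacteria (UID203)  5449  104  58  0  39  32  31  2  0  100.00  131.87  91.24",
    "H18_4_chromosome  f__Enterobacteriaceae (UID5124)  134  1173  336  1  1169  3  0  0  0  99.97  0.15  0.00",
]

CONTAMINATION_CUTOFF = 5.0  # assumed threshold, adjust to your own needs

parsed = [parse_row(r) for r in rows]
flagged = [p["bin"] for p in parsed if p["contamination"] > CONTAMINATION_CUTOFF]
print(flagged)  # prints ['P14_4_chromosome']
```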

ADD COMMENTlink written 4 weeks ago by Asaf8.3k

Many thanks, Asaf. Your reply is so explicit and really helpful.

ADD REPLYlink written 4 weeks ago by zhangdengwei70

Asaf, may I ask one more question? What is the meaning of Marker lineage? As I understand it, the 2.3 genomes in P14_4 belong to k__Bacteria but could not be placed more specifically in f__Enterobacteriaceae, right?

ADD REPLYlink written 4 weeks ago by zhangdengwei70

Exactly. CheckM couldn't assign this genome to a lower taxonomic level, potentially because it was contaminated with a bacterium from another phylum.

ADD REPLYlink written 4 weeks ago by Asaf8.3k