meaning of CheckM output
3.8 years ago
zhangdengwei ▴ 210

Hi all,

I used CheckM to check whether my bacterial isolates, each derived from a single colony, are contaminated. Here is the result:

  Bin Id                      Marker lineage           # genomes   # markers   # marker sets   0     1      2    3    4    5+   Completeness   Contamination   Strain heterogeneity
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
  P14_4_chromosome         k__Bacteria (UID203)           5449        104            58        0     39     32   31   2    0       100.00          131.87             91.24
  H13_5_chromosome         k__Bacteria (UID203)           5449        104            58        0     45     55   2    2    0       100.00          91.18              93.15
  H18_4_chromosome   f__Enterobacteriaceae (UID5124)      134         1173          336        1    1169    3    0    0    0       99.97            0.15               0.00
  H15_3_chromosome   f__Enterobacteriaceae (UID5124)      134         1173          336        1    1170    2    0    0    0       99.97            0.33               0.00
  H13_3_chromosome   f__Enterobacteriaceae (UID5124)      134         1173          336        1    1168    4    0    0    0       99.97            0.09               0.00
  H13_7_chromosome   f__Enterobacteriaceae (UID5162)       88         1207          328        2    1192    12   1    0    0       99.93            1.28              13.33
  P15_2_chromosome   f__Enterobacteriaceae (UID5124)      134         1172          336        1    1169    2    0    0    0       99.90            0.33               0.00
  H12_1_chromosome   f__Enterobacteriaceae (UID5124)      134         1173          336        2    1170    1    0    0    0       99.67            0.04               0.00
  H14_5_chromosome   f__Enterobacteriaceae (UID5124)      134         1173          336        4    1168    1    0    0    0       99.37            0.04               0.00
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

I am a bit confused about the meaning of completeness and contamination. Taking P14_4 as an example, the completeness is 100 while the contamination is 131.87. What do these values represent? Also, are completeness and contamination calculated based on the Marker lineage? Any advice would be greatly appreciated!

genome assembly CheckM Contigs • 6.9k views
3.8 years ago
Asaf 10k

CheckM uses single-copy marker genes to evaluate the completeness and contamination of a genome (or a pseudo-genome). These are genes expected to be present exactly once in nearly every genome of a given lineage. If all of them are found in the genome, then completeness is 100%. If they appear more than once, the genome is probably contaminated, because the extra copies most likely come from another organism in the assembly. So for P14_4 you can see that there are 104 markers: 39 appear only once, 32 appear twice, 31 appear three times, and 2 appear four times. Since all the markers are found, the genome is probably complete, but since there are multiple copies you are probably looking at about 2.3 genomes instead of one (100% completeness plus roughly 130% contamination). The strain heterogeneity of 91.24 suggests that most of that contamination comes from another strain of the main bacterium.

So overall most of your genomes look very good, P14_4 is 2.3 genomes and H13_5 is two genomes of two strains.
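To make the arithmetic concrete, here is a minimal sketch that recomputes completeness and contamination for P14_4 directly from the copy-number columns (0, 1, 2, 3, 4, 5+) of the table. The function name `naive_marker_estimates` is made up for this illustration and is not part of CheckM. It counts each marker individually, whereas CheckM, as far as I understand it, averages over the collocated marker sets (58 of them for this bin), so the contamination it prints will not match the reported 131.87 exactly.

```python
def naive_marker_estimates(copy_counts):
    """Per-marker completeness and contamination, both in percent.

    copy_counts maps a copy number to how many markers were found that many
    times, e.g. {1: 39, 2: 32}. The "5+" column is treated as exactly 5 here,
    which is an assumption of this sketch.
    """
    total = sum(copy_counts.values())
    # A marker counts as present if it was found at least once.
    present = sum(n for copies, n in copy_counts.items() if copies >= 1)
    # Every copy beyond the first is counted as one extra (contaminating) copy.
    extra = sum((copies - 1) * n for copies, n in copy_counts.items() if copies > 1)
    return 100.0 * present / total, 100.0 * extra / total


# Copy-number columns for P14_4_chromosome, taken from the table above.
p14_4 = {0: 0, 1: 39, 2: 32, 3: 31, 4: 2, 5: 0}
completeness, contamination = naive_marker_estimates(p14_4)
print(f"{completeness:.2f} {contamination:.2f}")  # prints: 100.00 96.15
```

Either way the picture is the same: every marker is present, and there is roughly one or more extra genome's worth of marker copies in the bin.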

Many thanks, Asaf. Your reply is very clear and really helpful.

Asaf, may I ask one more question? What is the meaning of Marker lineage? Based on my understanding, the 2.3 genomes in P14_4 belong to k__Bacteria but could not be placed more specifically in f__Enterobacteriaceae, right?

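Exactly. CheckM couldn't assign this genome to a lower taxonomic level, potentially because it was contaminated with a bacterium from another phylum. The marker lineage also determines which marker set is used for the estimates, which is why P14_4 and H13_5 are scored against 104 general bacterial markers while the other bins are scored against about 1,170 Enterobacteriaceae markers.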
